Re: [Pacemaker] How to build Pacemaker with Cman support?

2011-11-30 Thread Богомолов Дмитрий Викторович
Hello.
29 November 2011, 02:24, from Andrew Beekhof:
> 2011/11/28 Богомолов Дмитрий Викторович :
> > Thanks for your reply!
> >
> >
> > 28 November 2011, 03:54, from Andrew Beekhof:
> >> 2011/11/28 Богомолов Дмитрий Викторович :
> >> > Hello.
> >> > Addition. OS - Ubuntu 11.10
> > > I have installed libcman-dev, and now in config.log I can see
> >>
> >> I'm pretty sure the builds of pacemaker that come with ubuntu support
> >> cman already.
> > No, it's not.
> > I have tried to upgrade from these repositories: oneiric-proposed, 
> > ppa.launchpad.net/ubuntu-ha,
> > ppa.launchpad.net/ubuntu-ha-maintainers
> > There was no luck.
> > I posted about it on the Ubuntu community forum; there is no answer.
> > http://ubuntuforums.org/showthread.php?t=1885340
> > And I found a bug report without an answer:
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=639548
> >
> > That's why I'm now trying to build Pacemaker from source.
> >
> > I selected Ubuntu for its simplicity and the Oneiric release because it is 
> > the most recent.
> >
> > I want to run Xen VMs on a cluster. I have tried an active/passive configuration, 
> > but it's not exactly what I need, so now I'm trying to get an active/active 
> > configuration.
> >>
> >> >
> >> > configure:16634: checking for cman
> >> >
> >> > configure:16638: result: yes
>
> Ok, but you originally posted:
>
> configure:16634: checking for cman
>
> configure:16638: result: no
>
> So maybe something changed?
Yes, and I wrote about it.
First I tried to build this way:
aptitude build-dep pacemaker
apt-get source pacemaker
./autogen.sh
./configure --enable-fatal-warnings=no --with-cman 
--with-lcrso-dir=/usr/libexec/lcrso --prefix=/usr
make
make install
This way I get in config.log:
configure:16634: checking for cman
configure:16638: result: no

Then I installed libcman-dev and rebuilt:
make clean
./autogen.sh
./configure --enable-fatal-warnings=no --with-cman 
--with-lcrso-dir=/usr/libexec/lcrso --prefix=/usr
make
make install
And now I get:
configure:16634: checking for cman
configure:16638: result: yes

But when I successfully start cman and then start Pacemaker, it fails to 
start and I get:
ERROR: read_config: Corosync configured for CMAN but this build of Pacemaker 
doesn't support it

>
> >> >
> >> > But, after :
> >> > make && make install
> >> > service pacemaker start
> >> > I still get this log event:
> >> > ERROR: read_config: Corosync configured for CMAN but this build of 
> >> > Pacemaker
> >> > doesn't support it
> >> > Please, help!
> >> >
> >> > Hello.
> >> >
> >> > I try to configure Active/Active cluster Cman+Pacemaker, that described
> >> > there:
> >> > http://www.clusterlabs.org/doc/en-US..._from_Scratch/
> >> > I set Cman, but when I start Pacemaker with this command:
> >> > $sudo service pacemaker start
> >> > I get this log event:
> >> > ERROR: read_config: Corosync configured for CMAN but this build of 
> >> > Pacemaker
> >> > doesn't support it
> >> >
> >> > Now I try to build Pacemaker with Cman.
> >> >
> >> > I follow instructions there http://www.clusterlabs.org/wiki/Install
> >> >
> >> > only difference for configuring Pacemaker:
> >> >
> >> > ./autogen.sh && ./configure --prefix=$PREFIX --with-lcrso-dir=$LCRSODIR
> >> > -with-cman=yes
> >> >
> >> > But after installing pacemaker, I have the same error.
> >> >
> >> > When I look on config.log, I can see this:
> >> >
> >> > configure:16634: checking for cman
> >> >
> >> > configure:16638: result: no
> >> >
> >> > So, help please, how to build pacemaker with cman support?
> >> >

Re: [Pacemaker] CLVM & Pacemaker & Corosync on Ubuntu Oneiric Server

2011-11-30 Thread Vladislav Bogdanov
30.11.2011 14:08, Vadim Bulst wrote:
> Hello,
> 
> first of all I'd like to ask you a general question:
> 
> Has anybody successfully set up a clvm cluster with pacemaker and run
> it in production?

I will say yes once I finally resolve the remaining dlm & fencing issues.

> 
> Now back to the concrete problem:
> 
>  I configured two interfaces for corosync:
> 
> root@bbzclnode04:~# corosync-cfgtool -s
> Printing ring status.
> Local node ID 897624256
> RING ID 0
> id = 192.168.128.53
> status = ring 0 active with no faults
> RING ID 1
> id = 192.168.129.23
> status = ring 1 active with no faults
> 
> RRP set to passive
> 
> I also made some changes to my cib:
> 
> node bbzclnode04
> node bbzclnode06
> node bbzclnode07
> primitive clvm ocf:lvm2:clvmd \
> params daemon_timeout="30" \
> meta target-role="Started"

Please instruct clvmd to use the corosync stack instead of openais (-I
corosync); otherwise it uses the LCK service, which is not mature, and I
have observed major problems with it.

> primitive dlm ocf:pacemaker:controld \
> meta target-role="Started"
> group dlm-clvm dlm clvm
> clone dlm-clvm-clone dlm-clvm \
> meta interleave="true" ordered="true"
> property $id="cib-bootstrap-options" \
> dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="3" \
> no-quorum-policy="ignore" \
> stonith-enabled="false" \
> last-lrm-refresh="1322643084"
> 
> I cleaned and restarted the resources - nothing! :
> 
> crm(live)resource# cleanup dlm-clvm-clone
> Cleaning up dlm:0 on bbzclnode04
> Cleaning up dlm:0 on bbzclnode06
> Cleaning up dlm:0 on bbzclnode07
> Cleaning up clvm:0 on bbzclnode04
> Cleaning up clvm:0 on bbzclnode06
> Cleaning up clvm:0 on bbzclnode07
> Cleaning up dlm:1 on bbzclnode04
> Cleaning up dlm:1 on bbzclnode06
> Cleaning up dlm:1 on bbzclnode07
> Cleaning up clvm:1 on bbzclnode04
> Cleaning up clvm:1 on bbzclnode06
> Cleaning up clvm:1 on bbzclnode07
> Cleaning up dlm:2 on bbzclnode04
> Cleaning up dlm:2 on bbzclnode06
> Cleaning up dlm:2 on bbzclnode07
> Cleaning up clvm:2 on bbzclnode04
> Cleaning up clvm:2 on bbzclnode06
> Cleaning up clvm:2 on bbzclnode07
> Waiting for 19 replies from the CRMd... OK
> 
> crm_mon:
> 
> 
> Last updated: Wed Nov 30 10:15:09 2011
> Stack: openais
> Current DC: bbzclnode04 - partition with quorum
> Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 3 Nodes configured, 3 expected votes
> 1 Resources configured.
> 
> 
> Online: [ bbzclnode04 bbzclnode06 bbzclnode07 ]
> 
> 
> Failed actions:
> clvm:1_start_0 (node=bbzclnode06, call=11, rc=1, status=complete):
> unknown error
> clvm:0_start_0 (node=bbzclnode04, call=11, rc=1, status=complete):
> unknown error
> clvm:2_start_0 (node=bbzclnode07, call=11, rc=1, status=complete):
> unknown error
> 
> 
> When I look in the log there is a message which tells me that maybe
> another clvmd process is already running - but that isn't so.
> 
> "clvmd could not create local socket Another clvmd is probably already
> running"
> 
> Or is it a permission problem - writing to the filesystem? Is there a
> way to get rid of it?

You can try to run it manually under strace. It will show you what happens.
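For example, something like this might do (a sketch only; it assumes clvmd is
in the PATH, that the clvm resource has been stopped on that node first, and
the /tmp output path is arbitrary):

  crm resource stop dlm-clvm-clone
  strace -f -o /tmp/clvmd.strace clvmd -d -I corosync
  less /tmp/clvmd.strace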

> 
> Should I use a different distro - or install from source?
> 
> 
> Am 24.11.2011 22:59, schrieb Andreas Kurz:
>> Hello,
>>
>> On 11/24/2011 10:12 PM, Vadim Bulst wrote:
>>> Hi Andreas,
>>>
>>> I changed my cib:
>>>
>>> node bbzclnode04
>>> node bbzclnode06
>>> node bbzclnode07
>>> primitive clvm ocf:lvm2:clvmd \
>>> params daemon_timeout="30"
>>> primitive dlm ocf:pacemaker:controld
>>> group g_lock dlm clvm
>>> clone g_lock-clone g_lock \
>>> meta interleave="true"
>>> property $id="cib-bootstrap-options" \
>>> dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>>> cluster-infrastructure="openais" \
>>> expected-quorum-votes="3" \
>>> no-quorum-policy="ignore" \
>>> stonith-enabled="false" \
>>> last-lrm-refresh="1322049979
>>>
>>> but no luck at all.
>> I assume you did at least a cleanup on clvm and it still does not work
>> ... next step would be to grep for ERROR in your cluster log and look
>> for other suspicious messages to find out why clvm is not that motivated
>> to start.
>>
>>> "And use Corosync 1.4.x with redundant rings and automatic ring recovery
>>> feature enabled."
>>>
>>> I have two interfaces per server - they are bonded together and bridged
>>> for virtualization. Only one untagged VLAN. I tried to give a tagged
>>> VLAN bridge an address but it didn't work. My network conf looks like this:
>> One or two extra NICs are quite affordable today to build e.g. a direct
>> connection between the nodes (if possible).
>>
>> Regards,
>> Andreas
>>
>>
>>

Re: [Pacemaker] CLVM & Pacemaker & Corosync on Ubuntu Oneiric Server

2011-11-30 Thread Vadim Bulst

On 30.11.2011 12:22, Vladislav Bogdanov wrote:

You can try to run it manually under strace. It will show you what happens.



Here we go:

root@bbzclnode07:~# strace clvmd -d  -I cororsync
execve("/usr/sbin/clvmd", ["clvmd", "-d", "-I", "cororsync"], [/* 18 vars */]) 
= 0
brk(0)  = 0x12f7000
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f9f09dad000
access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)  = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=25864, ...}) = 0
mmap(NULL, 25864, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f9f09da6000
close(3)= 0
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\r\0\0\0\0\0\0"..., 
832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=14768, ...}) = 0
mmap(NULL, 2109704, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x7f9f0998b000
mprotect(0x7f9f0998d000, 2097152, PROT_NONE) = 0
mmap(0x7f9f09b8d000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 
0x7f9f09b8d000

close(3)= 0
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
open("/lib/libdevmapper-event.so.1.02.1", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \24\0\0\0\0\0\0"..., 
832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=18704, ...}) = 0
mmap(NULL, 2113872, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x7f9f09786000
mprotect(0x7f9f0978a000, 2093056, PROT_NONE) = 0
mmap(0x7f9f09989000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 
0x7f9f09989000

close(3)   

Re: [Pacemaker] CLVM & Pacemaker & Corosync on Ubuntu Oneiric Server

2011-11-30 Thread Vadim Bulst

@Vladislav

Where and how can I set this switch for the cluster manager if it runs as a 
resource?



[Pacemaker] lrmd hanging

2011-11-30 Thread coredump
So last night I was supposed to get a cluster running. Everything
worked OK in a virtual environment using the same software, and in my
experience I only had to install pacemaker and corosync (from the
Ubuntu 10.04 PPA) and get it rolling. What really happened was: I
could use crm configure to set cluster properties like resource
stickiness and quorum, and to disable stonith. When I tried to add
primitives, crm just hung there, without returning an error or
completing.
I noticed these two entries in the log every time crm tries to
configure something for the first time:

Nov 30 05:33:26 server lrmd: [18102]: debug: on_msg_register:client
lrmadmin [18159] registered
Nov 30 05:33:26 server lrmd: [18102]: debug: on_receive_cmd: the IPC
to client [pid:18159] disconnected.

Also, when I stop corosync it sends a TERM signal to lrmd, but lrmd
doesn't exit even after several minutes; I have to kill -9 it. I tried
to strace lrmd but it's stuck on a futex, which doesn't really
help a lot:

Process 32764 attached - interrupt to quit
futex(0xe070d8, FUTEX_WAIT_PRIVATE, 2, NULL^C 

Does anyone have any idea what would make lrmd just hang?

[]s
core



Re: [Pacemaker] Strange failover behaviour SOLVED

2011-11-30 Thread Hans Lammerts
Hi Andreas,

Thank you for your answer.

I didn't know about the instability issues with DRBD 8.4.
The reason why I compiled everything myself is that the versions shipped with CentOS 
6 had some problems as well, like heartbeat processes taking up 100% CPU. I don't 
have that problem now.

I did try switching to the ocf way of stopping/starting httpd, and this seems 
to work perfectly.
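For reference, a minimal sketch of what the ocf-based primitive might look
like (the parameter values here are illustrative, not taken from the original
configuration):

primitive apache2 ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
               statusurl="http://localhost/server-status" \
        op monitor interval="10" timeout="30" \
        op start interval="0" timeout="120" \
        op stop interval="0" timeout="120"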

Thanks again, another problem solved.

Hans

-Original Message-
From: Andreas Kurz [mailto:andr...@hastexo.com] 
Sent: Wednesday, November 30, 2011 00:12
To: pacemaker@oss.clusterlabs.org
Subject: Re: [Pacemaker] Strange failover behaviour

On 11/29/2011 07:14 PM, Hans Lammerts wrote:
> Hi there,
> 
>  
> 
> I have something strange I would like the community to give its 
> opinion on. I can’t figure out
> 
> what is going wrong.
> 
>  
> 
> I have a 2 node cluster (named cl1 and cl2). On this cluster I’m 
> running MySQL, Apache, and
> 
> Zarafa. Both nodes run CentOS 6.
> 
> I have downloaded all latest sources for DRBD, Cluster Glue, Resource 
> Agents, Heartbeat
> 
> and Pacemaker and compiled them. Everything seems to be OK.

BTW ... no need to compile Pacemaker/Glue/Agents ... it is shipped with CentOS 
6 ... and use DRBD 8.4.0 only for test setups, there are some known stability 
issues.

> 
>  
> 
> I believe my Pacemaker setup to be OK, but I may be mistaken. Will 
> attach the config below.
> 
>  
> 
> What I experience when I do a failover from cl1 to cl2 is that MySQL 
> and Zarafa fail over without
> 
> any problems, but httpd seems to be getting into a loop of starting and 
> stopping.
> 
> The error that is displayed is this :
> 
>  
> 
> apache2_monitor_1 (node=cl2, call=502, rc=7, status=complete): not 
> running
> 

the cluster and apache logs should give you good hints on the problem ...

>  
> 
> If I remember to set the failcount of the apache2 resource to 0, httpd 
> will eventually start after
> 
> quite a number of retries :
> 
>  
> 
> [root@cl2 httpd]# crm resource failcount apache2 show cl2
> 
> scope=status  name=fail-count-apache2 value=69
> 
>  
> 
> If I forget to reset the failcount (something you should not need to 
> do), the failcount will reach
> 
> infinity at some time in the future, and httpd won’t start. The number 
> of times Pacemaker
> 
> retries is also different every time.
> 
>  
> 
> Wait, it gets stranger…
> 
> Putting cl1 online again, the failback is initiated, and this goes 
> without any problems. So, it looks
> 
> like the problems reside only on the second cluster half. The hardware 
> of cl2 is different from cl1, and
> 
> it is the slower machine of the two.
> 
> Yes, I made very sure every configuration file is the same on both nodes.
> 
> And yes, I made sure the server-status section in httpd.conf is 
> uncommented, as is the
> 
> ExtendedStatus directive. Doing a wget -O - 
> http://localhost/server-status?auto works

you are using the lsb script ... this does a simple pid check, at least on the 
SL 6.1 test machines in my lab. Have you tried the ocf RA?

> 
> perfectly.
> 
>  
> 
> Can anyone please tell me what the problem could be here ?

Dig through your logs ... or hire someone to do it for you ;-)

Regards,
Andreas

--
Need help with Pacemaker?
http://www.hastexo.com/now

> 
> Thanks.
> 
>  
> 
> Versioninfo:
> 
> CentOS 6.0
> 
> DRBD 8.4.0
> Glue 1.0.8
> 
> Resource agents 3.9.2
> 
> Heartbeat 3.0.5
> 
> Pacemaker 1.0.11
> 
>  
> 
> Pacemaker config:
> 
>  
> 
> node $id="62b94e0a-532f-4f99-acdb-57d6052a5635" cl1 \
> attributes standby="on"
> node $id="7444dfb4-2c9b-4130-83c4-c0cd3d7ec006" cl2 \
> attributes standby="off"
> primitive apache2 lsb:httpd \
> op monitor interval="10" timeout="30" \
> op start interval="0" timeout="120" \
> op stop interval="0" timeout="120" \
> meta target-role="Started"
> primitive drbd_http ocf:linbit:drbd \
> params drbd_resource="http" \
> op start interval="0" timeout="240" \
> op stop interval="0" timeout="100" \
> op monitor interval="59s" role="Master" timeout="30s" \
> op monitor interval="60s" role="Slave" timeout="30s"
> primitive drbd_mysql ocf:linbit:drbd \
> params drbd_resource="mysql" \
> op start interval="0" timeout="240" \
> op stop interval="0" timeout="100" \
> op monitor interval="59s" role="Master" timeout="30s" \
> op monitor interval="60s" role="Slave" timeout="30s"
> primitive drbd_zarafa ocf:linbit:drbd \
> params drbd_resource="zarafa" \
> op start interval="0" timeout="240" \
> op stop interval="0" timeout="100" \
> op monitor interval="59s" role="Master" timeout="30s" \
> op monitor interval="60s" role="Slave" timeout="30s"
> primitive http_fs ocf:heartbeat:Filesystem \
> params device="/dev/drbd1" directory="/var/www/html

Re: [Pacemaker] Regarding Stonith RAs

2011-11-30 Thread neha chatrath
Hello Andreas,

Pacemaker is not built with Heartbeat support on RHEL-6 and its derivatives.
How do I check this, and what steps do I need to take to resolve this issue?
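One way to check (a sketch, not an authoritative procedure; it assumes the
pacemakerd binary from that build is on the PATH) is to list the cluster
stacks the build supports and look for "heartbeat" among the features:

  pacemakerd --features

If heartbeat is not listed, the usual route is to rebuild Pacemaker from
source with the Heartbeat development packages installed (e.g. heartbeat-devel
and cluster-glue-libs-devel on RHEL-like systems) and --with-heartbeat passed
to ./configure, or to use packages that were built that way.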

Thanks and regards
Neha Chatrath

On Thu, Nov 24, 2011 at 5:38 PM, neha chatrath wrote:

> Hello,
>
> I could get the list of Stonith RAs by installing the cman, clvm, ricci,
> pacemaker, and rgmanager RPMs provided by the CentOS 6 distribution.
> But unfortunately, after installing these packages, not all of the processes
> related to Pacemaker come up when starting the Heartbeat daemon.
> When I start the Heartbeat daemon, only the following processes are started:
>


> [root@p init.d]# ps -eaf |grep heartbeat
> root  3522 1  0 17:26 ?00:00:00 heartbeat: master control
> process
> root  3525  3522  0 17:26 ?00:00:00 heartbeat: FIFO
> reader
> root  3526  3522  0 17:26 ?00:00:00 heartbeat: write: bcast
> eth1
> root  3527  3522  0 17:26 ?00:00:00 heartbeat: read: bcast
> eth1
> root  3538  3381  0 17:26 pts/300:00:00 grep heartbeat
>
> In the log messages, following error logs are observed:
> "Nov 24 17:26:19 p heartbeat: [3522]: debug: Signing on API client 3539
> (ccm)
> Nov 24 17:26:19 p ccm: [3539]: info: Hostname: p
> Nov 24 17:26:19 p attrd: [3543]: info: Invoked: /usr/lib/heartbeat/attrd
> Nov 24 17:26:19 p stonith-ng: [3542]: info: Invoked:
> /usr/lib/heartbeat/stonithd
> Nov 24 17:26:19 p cib: [3540]: info: Invoked: /usr/lib/heartbeat/cib
> *Nov 24 17:26:19 p lrmd: [3541]: ERROR: socket_wait_conn_new: trying to
> create in /var/run/heartbeat/lrm_cmd_sock bind:: No such file or directory
> *
> Nov 24 17:26:19 p lrmd: [3541]: ERROR: main: can not create wait
> connection for command.
> Nov 24 17:26:19 p lrmd: [3541]: ERROR: Startup aborted (can't create comm
> channel).  Shutting down.
> Nov 24 17:26:19 p heartbeat: [3522]: WARN: Managed /usr/lib/heartbeat/lrmd
> -r process 3541 exited with return code 100.
> Nov 24 17:26:19 p heartbeat: [3522]: ERROR: Client /usr/lib/heartbeat/lrmd
> -r exited with return code 100.
> Nov 24 17:26:19 p attrd: [3543]: info: crm_log_init_worker: Changed active
> directory to /var/lib/heartbeat/cores/hacluster
> Nov 24 17:26:19 p attrd: [3543]: info: main: Starting up
> Nov 24 17:26:19 p stonith-ng: [3542]: info: crm_log_init_worker: Changed
> active directory to /var/lib/heartbeat/cores/root
> Nov 24 17:26:19 p cib: [3540]: info: crm_log_init_worker: Changed active
> directory to /var/lib/heartbeat/cores/hacluster
> Nov 24 17:26:19 p attrd: [3543]: CRIT: get_cluster_type: This installation
> of Pacemaker does not support the '(null)' cluster infrastructure.
> Terminating.
> Nov 24 17:26:19 p stonith-ng: [3542]: CRIT: get_cluster_type: This
> installation of Pacemaker does not support the '(null)' cluster
> infrastructure.  Terminating.
> Nov 24 17:26:19 p heartbeat: [3522]: WARN: Managed
> /usr/lib/heartbeat/attrd process 3543 exited with return code 100.
> Nov 24 17:26:19 p heartbeat: [3522]: ERROR: Client
> /usr/lib/heartbeat/attrd exited with return code 100.
> Nov 24 17:26:19 p heartbeat: [3522]: info: the send queue length from
> heartbeat to client ccm is set to 1024
> Nov 24 17:26:19 p heartbeat: [3522]: WARN: Managed
> /usr/lib/heartbeat/stonithd process 3542 exited with return code 100.
> Nov 24 17:26:19 p heartbeat: [3522]: ERROR: Client
> /usr/lib/heartbeat/stonithd exited with return code 100.
> *Nov 24 17:26:19 p cib: [3540]: info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)*
> Nov 24 17:26:19 p cib: [3540]: debug: log_data_element: readCibXmlFile:
> [on-disk] <cib validate-with="pacemaker-1.2" cib-last-written="Mon Nov 21 11:09:22 2011" >
> ...
> 
> Nov 24 17:26:19 p crmd: [3544]: info: crmd_init: Starting crmd
> Nov 24 17:26:19 p crmd: [3544]: debug: s_crmd_fsa: Processing I_STARTUP: [
> state=S_STARTING cause=C_STARTUP origin=crmd_init ]
> Nov 24 17:26:19 p crmd: [3544]: debug: do_fsa_action: actions:trace://
> A_LOG
> Nov 24 17:26:19 p crmd: [3544]: debug: do_fsa_action: actions:trace://
> A_STARTUP
> Nov 24 17:26:19 p crmd: [3544]: debug: do_startup: Registering Signal
> Handlers
> Nov 24 17:26:19 p crmd: [3544]: debug: do_startup: Creating CIB and LRM
> objects
> Nov 24 17:26:19 p crmd: [3544]: debug: do_fsa_action: actions:trace://
> A_CIB_START
> Nov 24 17:26:19 p crmd: [3544]: debug: init_client_ipc_comms_nodispatch:
> Attempting to talk on: /var/run/crm/cib_rw
> Nov 24 17:26:19 p crmd: [3544]: debug: init_client_ipc_comms_nodispatch:
> Could not init comms on: /var/run/crm/cib_rw
> Nov 24 17:26:19 p crmd: [3544]: debug: cib_native_signon_raw: Connection
> to command channel failed
> Nov 24 17:26:19 p crmd: [3544]: debug: init_client_ipc_comms_nodispatch:
> Attempting to talk on: /var/run/crm/cib_callback
> Nov 24 17:26:19 p crmd: [3544]: debug: init_client_ipc_comms_nodispatch

Re: [Pacemaker] 2 node cluster questions

2011-11-30 Thread Hellemans Dirk D
Hi Mark,

 

Thanks for your help! Indeed, you get a race condition... that’s why an 
external quorum daemon (such as the one supplied with HP’s ServiceGuard) would 
be nice. It looks like this is what Linux-HA is heading for 
(http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for,
 setup number 4) but it’s not there yet.

 

The only way to do it with SBD is to avoid auto-startup of a cluster node (e.g. 
disable the corosync and pacemaker init scripts) --> this avoids fencing the other 
node after being fenced. Or to use SBD on iSCSI storage... if you have no network 
connection, you cannot fence, and sbd will let the watchdog time out --> the node 
which lost network connectivity is going to be reset. If network connectivity 
is still missing at reboot, it cannot fence the other... otherwise it’ll join 
the cluster. Any flaw here? That is +/- the same as with a quorum server... if 
it cannot reach the quorum server, the node can start up but should not fence 
the other (which has quorum).

 

Anyway, I must admit: a quorum server or 3rd node seems safer by design.

 

Rgds, Dirk

 

From: mark - pacemaker list [mailto:m+pacema...@nerdish.us] 
Sent: Friday, November 25, 2011 8:27 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] 2 node cluster questions

 

Hi Dirk,

 

On Fri, Nov 25, 2011 at 6:05 AM, Hellemans Dirk D  
wrote:

Hello everyone,

 

I’ve been reading a lot lately about using Corosync/Openais in combination with 
Pacemaker: SuSe Linux documentation, Pacemaker & Linux-ha website, interesting 
blogs, mailing lists, etc. As I’m particularly interested in how well two-node 
clusters (located within the same server room) are handled, I was a bit 
confused by the fact that quorum disks/ quorum servers are (not yet?) 
supported/used. Some suggested to add a third node which is not actively 
participating (e.g. only running corosync or with heartbeat but in standby 
mode). That might be a solution but doesn’t “feel” right, especially if you 
consider multiple two-node clusters... that would require a lot of extra 
“quorum only nodes”. Somehow SBD (storage based death) in combination with a 
hardware watchdog timer seemed to also provide a solution: run it on top of 
iSCSI storage and you end up with a fencing device and some sort of “network 
based quorum” as tiebreaker. If one node loses network connectivity, sbd + 
watchdog will make sure it’s being fenced.

 

I’d love to hear your ideas about 2 node cluster setups. What is the best way 
to do it? Any chance we’ll get quorum disks/ quorum servers in the (near) 
future?

 

 

 

Our experience with a two-node SBD-based cluster wasn't good.  After setup, we 
started on failure scenarios.  The first test was to drop network connectivity 
for one of the nodes while both could still access storage.  The nodes fenced 
each other (sort of like a STONITH deathmatch you can read about), killing all 
services and leaving us waiting for both nodes to boot back up.  Obviously, a 
complete failure of testing, we didn't even proceed with further checks.  We 
took a standard PC and built it out as a third node, giving the cluster true 
quorum, and now it's rock-solid and absolutely correct in every failure 
scenario we throw at it.  For production use, the very real possibility of two 
nodes killing each other just wasn't worth the risk to us.

 

If you go with two nodes and SBD, do a lot of testing.  No matter how much you 
test though, if they lose visibility to each other on the network but can both 
still see the storage, you've got a race where the node that *should* be fenced 
(the one that has its network cables disconnected) can fence the node that is 
still 100% healthy and actively serving clients.

 

Maybe there's a way to configure around that, I'd be interested in hearing how 
if so.

 

Regards,

Mark

 

 

 

 

In addition, say you’re not using sbd but an IPMI based fencing 
solution. You lose network connectivity on one of the nodes (I know, they’re 
redundant but still...sh*t happens ;) Does Pacemaker know which of both nodes 
lost network connectivity? E.g.: node 1 runs Oracle database, node 2 nothing. 
Node 2 loses network connectivity (e.g. both NICs without signal because 
unplugged by an errant technician ;) )... => split brain situation occurs, but 
who’ll be fenced? The one with Oracle running ?? I really hope not... cause in 
this case, the cluster can “see” there’s no signal on the NICs of node2. Would 
be interesting to know more about how Pacemaker/corosync makes such kind of 
decisions... how to choose which one will be fenced in case of split brain. Is 
it randomly chosen? Is it the DC which decides? Based on NIC state? I did some 
quick testing with 2 VMs and at first, it looks like Pacemaker/corosync always 
fences the correct node, or: the node where I unplugged the “virtual” cable. 

 

I’m curious!

 

Thanks a lot!

 

Re: [Pacemaker] How to build Pacemaker with Cman support?

2011-11-30 Thread Nick Khamis
Could you show the output of:

pacemakerd --features

Please make sure that you don't have a pcmk file in /etc/corosync/service.d/.
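For example (a sketch; the service.d path assumes the default Ubuntu/corosync
layout):

  pacemakerd --features | grep -i cman
  ls /etc/corosync/service.d/

No output from the first command would mean the running build really was
compiled without CMAN support; the second should not list a pcmk file when
CMAN is driving corosync.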

Cheers,

Nick.


Re: [Pacemaker] lrmd hanging

2011-11-30 Thread Dejan Muhamedagic
Hi,

On Wed, Nov 30, 2011 at 11:16:40AM -0200, coredump wrote:
> So last night I was supposed to get a cluster running, everything
> worked ok on a virtual environment using the same software and by my
> experience I only had to install pacemaker and corosync (from the
> ubuntu 10.04 ppa) and get it rolling. What really happened was: I
> could use crm configure to set properties to the cluster like resource
> stickiness and quorum and disable stonith. When I tried to add
> primitives, the crm just hang there, without returning an error or
> completing.
> I noticed those two entries in the log, everytime crm tries to
> configure something the first time:
> 
> Nov 30 05:33:26 server lrmd: [18102]: debug: on_msg_register:client
> lrmadmin [18159] registered
> Nov 30 05:33:26 server lrmd: [18102]: debug: on_receive_cmd: the IPC
> to client [pid:18159] disconnected.
> 
> Also, when I stop corosync it sends a TERM signal for lrmd but it
> doesn't exit, even after some minutes, I have to kill -9 it. I tried
> to strace lrmd but it's stuck on a FUTEX that really doesn't really
> help a lot:
> 
> Process 32764 attached - interrupt to quit
> futex(0xe070d8, FUTEX_WAIT_PRIVATE, 2, NULL^C 
> 
> Anyone has any idea what would make lrmd to just hang?

It's probably the support for the Ubuntu-specific init system. That
bug (in glib) has been fixed, but I don't know whether there are fixed
packages. cluster-glue apparently doesn't need to be
updated, only glib. Best to open a bug report with Ubuntu.

Thanks,

Dejan




Re: [Pacemaker] CLVM & Pacemaker & Corosync on Ubuntu Oneiric Server

2011-11-30 Thread Vladislav Bogdanov
30.11.2011 15:51, Vadim Bulst wrote:
> @Vladislav
> 
> Where and how can I set the switch for the cluster manager if it runs as
> a resource.

Ahm, I use my own RA for clvmd, and don't remember if upstream has that
possibility.

Please find it attached.
I can't say it is perfect - it is just a quick hack over one I found by
accident - but it does its job for me.
Set 'avoid_lck' to 1.
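A minimal sketch of how that might look in the CIB, assuming the attached
agent is installed under a hypothetical ocf provider named "custom":

primitive clvm ocf:custom:clvmd \
        params daemon_timeout="30" avoid_lck="1" \
        meta target-role="Started"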

Best,
Vladislav


Re: [Pacemaker] lrmd hanging

2011-11-30 Thread coredump
Hey Dejan, 2 questions:

1) My test environment was virtual, but using the same versions as
the server, and it worked.
2) Can you point me to a bug report about this Glib bug?



Re: [Pacemaker] lrmd hanging

2011-11-30 Thread Ante Karamatic
On 30.11.2011 14:16, coredump wrote:

> So last night I was supposed to get a cluster running, everything
> worked ok on a virtual environment using the same software and by my
> experience I only had to install pacemaker and corosync (from the
> ubuntu 10.04 ppa) and get it rolling. What really happened was: I

Which PPA are you using? This one:

https://launchpad.net/~ubuntu-ha-maintainers/+archive/ppa/

has everything you need for 10.04, including glib and rhcs fixes.
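For example, on 10.04 the PPA can be enabled and the fixed packages pulled in
roughly like this (a sketch; add-apt-repository comes from the
python-software-properties package):

  sudo apt-get install python-software-properties
  sudo add-apt-repository ppa:ubuntu-ha-maintainers/ppa
  sudo apt-get update
  sudo apt-get dist-upgrade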



Re: [Pacemaker] CLVM & Pacemaker & Corosync on Ubuntu Oneiric Server

2011-11-30 Thread Ante Karamatic
On 30.11.2011 13:10, Vadim Bulst wrote:

> I have now created the directory "/var/run/lvm". It wasn't there - work for
> the package maintainer.

Hm... That directory is used for file based locking. clvmd shouldn't be
using that. Did you set up cluster locking in /etc/lvm/lvm.conf
(locking_type)?
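For cluster locking via clvmd, /etc/lvm/lvm.conf would normally carry
something like this (a sketch of the relevant lines only):

  global {
      # 3 = built-in clustered locking through clvmd
      locking_type = 3
      fallback_to_local_locking = 0
  }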



Re: [Pacemaker] colocation issue with master-slave resources

2011-11-30 Thread Patrick H.

Sent: Mon Nov 28 2011 16:10:01 GMT-0700 (MST)
From: Patrick H.
To: The Pacemaker cluster resource manager, Andreas Kurz
Subject: Re: [Pacemaker] colocation issue with master-slave resources

Sent: Mon Nov 28 2011 15:27:10 GMT-0700 (MST)
From: Andrew Beekhof
To: The Pacemaker cluster resource manager, Andreas Kurz
Subject: Re: [Pacemaker] colocation issue with master-slave resources

Perhaps try an ordering constraint; I may have also fixed something
in this area for 1.1.6, so an upgrade might also help.

On Tue, Nov 29, 2011 at 1:38 AM, Patrick H.  
wrote:

Sent: Mon Nov 28 2011 01:31:22 GMT-0700 (MST)
From: Andreas Kurz
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] colocation issue with master-slave resources

On 11/28/2011 04:51 AM, Patrick H. wrote:

I'm trying to set up a colocation rule so that a couple of master-slave
resources can't be master unless another resource is running on the same
node, and am getting the exact opposite of what I want. The master-slave
resources are getting promoted to master on the node that this other
resource isn't running on.

In the below example, 'stateful1:Master' and 'stateful2:Master' should
be on the same node 'dummy' is on. It works just fine if I change the
colocation around so that 'dummy' depends on the stateful resources
being master, but I don't want that. I want dummy to be able to run no
matter what, but the stateful resources not be able to become master
without dummy.


# crm status

Last updated: Mon Nov 28 03:47:04 2011
Stack: cman
Current DC: devlvs03 - partition with quorum
Version: 1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
6 Resources configured.


Online: [ devlvs04 devlvs03 ]

  dummy(ocf::pacemaker:Dummy):Started devlvs03
  Master/Slave Set: stateful1-ms [stateful1]
  Masters: [ devlvs04 ]
  Slaves: [ devlvs03 ]
  Master/Slave Set: stateful2-ms [stateful2]
  Masters: [ devlvs04 ]
  Slaves: [ devlvs03 ]


# crm configure show
node devlvs03 \
 attributes standby="off"
node devlvs04 \
 attributes standby="off"
primitive dummy ocf:pacemaker:Dummy \
 meta target-role="Started"
primitive stateful1 ocf:pacemaker:Stateful
primitive stateful2 ocf:pacemaker:Stateful
ms stateful1-ms stateful1
ms stateful2-ms stateful2
colocation stateful1-colocation inf: stateful1-ms:Master dummy
colocation stateful2-colocation inf: stateful2-ms:Master dummy

use dummy:Started ... default is to use same role as left resource, and
Dummy will never be in role Master ...

Regards,
Andreas

Tried that too (it just wasn't in the configuration at the time I sent the 
email), no effect.


Upgraded to 1.1.6 and put in an ordering constraint, still no joy.

# crm status

Last updated: Mon Nov 28 23:09:37 2011
Last change: Mon Nov 28 23:08:34 2011 via cibadmin on devlvs03
Stack: cman
Current DC: devlvs03 - partition with quorum
Version: 1.1.6-1.el6-b379478e0a66af52708f56d0302f50b6f13322bd
2 Nodes configured, 2 expected votes
5 Resources configured.


Online: [ devlvs04 devlvs03 ]

 dummy(ocf::pacemaker:Dummy):Started devlvs03
 Master/Slave Set: stateful1-ms [stateful1]
 Masters: [ devlvs04 ]
 Slaves: [ devlvs03 ]
 Master/Slave Set: stateful2-ms [stateful2]
 Masters: [ devlvs04 ]
 Slaves: [ devlvs03 ]


# crm configure show
node devlvs03 \
attributes standby="off"
node devlvs04 \
attributes standby="off"
primitive dummy ocf:pacemaker:Dummy \
meta target-role="Started"
primitive stateful1 ocf:pacemaker:Stateful
primitive stateful2 ocf:pacemaker:Stateful
ms stateful1-ms stateful1
ms stateful2-ms stateful2
colocation stateful1-colocation inf: stateful1-ms:Master dummy:Started
colocation stateful2-colocation inf: stateful2-ms:Master dummy:Started
order stateful1-start inf: dummy:start stateful1-ms:promote
order stateful2-start inf: dummy:start stateful2-ms:promote
property $id="cib-bootstrap-options" \
dc-version="1.1.6-1.el6-b379478e0a66af52708f56d0302f50b6f13322bd" \
cluster-infrastructure="cman" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
last-lrm-refresh="1322450542"
Well, there is a really ugly workaround that solves this: if I convert 
'dummy' to a master-slave resource and just have the slave do nothing, 
it does obey the colocation rule when I tell it to keep the Master roles 
on the same box.
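In crm syntax the workaround described above might look roughly like this
(a sketch; ocf:pacemaker:Stateful stands in for a dummy agent whose slave
role does nothing, and the resource names follow the configuration shown
earlier):

primitive dummy ocf:pacemaker:Stateful \
        meta target-role="Started"
ms dummy-ms dummy
colocation stateful1-colocation inf: stateful1-ms:Master dummy-ms:Master
colocation stateful2-colocation inf: stateful2-ms:Master dummy-ms:Master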


Re: [Pacemaker] CLVM & Pacemaker & Corosync on Ubuntu Oneiric Server

2011-11-30 Thread Vladislav Bogdanov
30.11.2011 19:27, Ante Karamatic wrote:
> On 30.11.2011 13:10, Vadim Bulst wrote:
> 
>> I have now created the directory "/var/run/lvm". It wasn't there - work for
>> the package maintainer.
> 
> Hm... That directory is used for file based locking. clvmd shouldn't be
> using that. Did you set up cluster locking in /etc/lvm/lvm.conf
> (locking_type)?

It tries to create a unix socket there:
bind(3, {sa_family=AF_FILE, path="/var/run/lvm/clvmd.sock"}, 110)
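If it only comes down to the missing directory, creating it before clvmd
starts is enough for the socket bind to succeed (a sketch; on Oneiric /var/run
lives on tmpfs, so it has to be recreated on every boot, e.g. from an init
script):

  install -d -m 0755 /var/run/lvm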



Re: [Pacemaker] lrmd hanging

2011-11-30 Thread coredump
It turned out to be the libglib bug; it was fixed with the packages from the PPA.

Thanks!



[Pacemaker] [PATCH]Build error of pacemaker-1.0.12

2011-11-30 Thread nozawat
Hi

 I ran into the following error when building pacemaker-1.0.12.
--
cc1: warnings being treated as errors
remote.c: In function 'create_tls_session':
remote.c:85: warning: passing argument 1 of 'gnutls_dh_set_prime_bits'
from incompatible pointer type
gmake[2]: *** [remote.lo] Error 1
--
 I am sending a patch for the error mentioned above.

Regards,
Tomo


remote.c.patch
Description: Binary data


Re: [Pacemaker] How to build Pacemaker with Cman support?

2011-11-30 Thread Богомолов Дмитрий Викторович
I solved the problem.
You need to install libfence-dev.
After that, the ./configure script includes the needed directives in the Makefile.
And now I am experimenting with an active/active configuration.
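For the record, the sequence that ends up with a CMAN-enabled build on Oneiric
is roughly the following (a sketch assembled from this thread; package names
are the Ubuntu ones and the lcrso path is the one used above):

  sudo aptitude build-dep pacemaker
  sudo apt-get install libcman-dev libfence-dev
  apt-get source pacemaker
  ./autogen.sh
  ./configure --enable-fatal-warnings=no --with-cman \
      --with-lcrso-dir=/usr/libexec/lcrso --prefix=/usr
  make
  sudo make install
  grep -A1 'checking for cman' config.log   # should end with "result: yes"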

Thanks, anyway, for your attention and advice.
