Re: [Pacemaker] Corosync 1.4.7: zombie (defunct)

2015-01-04 Thread Sergey Arlashin
Pacemaker 1.1.6

It runs on Ubuntu 12.04 LTS, 64-bit. 

Linux lb-node1 3.11.0-23-generic #40~precise1-Ubuntu SMP Wed Jun 4 22:06:36 UTC 
2014 x86_64 x86_64 x86_64 GNU/Linux

--
Best regards,
Sergey Arlashin
 

On Jan 5, 2015, at 7:59 AM, Andrew Beekhof  wrote:

> Pacemaker version? It looks familiar, but it depends on the version number.
> 
>> On 29 Dec 2014, at 10:24 pm, Sergey Arlashin  
>> wrote:
>> 
>> Hi!
>> Recently I noticed that one of my nodes had OFFLINE status in the 'crm 
>> status' output, but it actually was not offline: I could ssh to the node and 
>> run 'crm status' from its console. After some time it became online again. 
>> This has happened several times, to other nodes as well, without any obvious reason. 
>> 
>> Still no errors or fatal messages in the logs. The only warning messages I 
>> could find in corosync.log were the following:
>> 
>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1346 -> 0.233.1347 not applied to 0.233.1354: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1347 -> 0.233.1348 not applied to 0.233.1354: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1348 -> 0.233.1349 not applied to 0.233.1354: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1349 -> 0.233.1350 not applied to 0.233.1354: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1350 -> 0.233.1351 not applied to 0.233.1354: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1351 -> 0.233.1352 not applied to 0.233.1354: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1352 -> 0.233.1353 not applied to 0.233.1354: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1353 -> 0.233.1354 not applied to 0.233.1354: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 491 
>> for last-failure-Cachier=1419729443 failed: Application of an update diff 
>> failed
>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 494 
>> for fail-count-Cachier=1 failed: Application of an update diff failed
>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 497 
>> for probe_complete=true failed: Application of an update diff failed
>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 500 
>> for last-failure-Cachier=1419729443 failed: Application of an update diff 
>> failed
>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 503 
>> for fail-count-Cachier=1 failed: Application of an update diff failed
>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1338 -> 0.233.1339 not applied to 0.233.1382: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1339 -> 0.233.1340 not applied to 0.233.1382: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1340 -> 0.233.1341 not applied to 0.233.1382: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1341 -> 0.233.1342 not applied to 0.233.1382: current "num_updates" is 
>> greater than required
>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 
>> 0.233.1342 -> 0.233.1343 not applied to 0.233.1382: current "num_updates" is 
>> greater than required
>> 
>> After examining the corosync processes with ps, I found that on all my nodes 
>> there are zombie corosync processes like:
>> 
>> root 13892  0.0  0.0  0 0 ?ZDec26   0:04 [corosync] 
>> 
>> root 21793  0.0  0.0  0 0 ?ZDec26   0:00 [corosync] 
>> 
>> root 27009  1.3  1.0 714292 10784 ?Ssl  Dec18 223:38 
>> /usr/sbin/corosync
>> 
>> Is it OK to have zombie corosync processes on the nodes? Or does it suggest 
>> that something is going wrong? 
>> 
>> Thanks in advance
>> 
>> --
>> Best regards,
>> Sergey Arlashin
>> 
>> 
>> 
>> 
>> 
>> ___
>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> 
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
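
A zombie entry just means the child has exited but its parent has not reaped it yet, so the next question is which process is the parent. A minimal sketch for checking that, assuming a GNU ps (the PID 27009 used below is the running corosync daemon from the ps listing above):

  # list zombie corosync entries together with their parent PID
  ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/ && $4 == "corosync"'

  # then look at what the parent actually is, e.g. if the PPID column shows 27009
  ps -o pid,args -p 27009

If the zombies keep accumulating under the corosync daemon itself, that is worth reporting together with the exact corosync/pacemaker versions, as Andrew asks above.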

Re: [Pacemaker] Corosync 1.4.7: zombie (defunct)

2015-01-04 Thread Andrew Beekhof
Pacemaker version? It looks familiar, but it depends on the version number.

> On 29 Dec 2014, at 10:24 pm, Sergey Arlashin  
> wrote:
> 
> Hi!
> Recently I noticed that one of my nodes had OFFLINE status in the 'crm status' 
> output, but it actually was not offline: I could ssh to the node and run 'crm 
> status' from its console. After some time it became online again. This has 
> happened several times, to other nodes as well, without any obvious reason. 
> 
> Still no errors or fatal messages in the logs. The only warning messages I could 
> find in corosync.log were the following:
> 
> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1346 
> -> 0.233.1347 not applied to 0.233.1354: current "num_updates" is greater 
> than required
> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1347 
> -> 0.233.1348 not applied to 0.233.1354: current "num_updates" is greater 
> than required
> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1348 
> -> 0.233.1349 not applied to 0.233.1354: current "num_updates" is greater 
> than required
> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1349 
> -> 0.233.1350 not applied to 0.233.1354: current "num_updates" is greater 
> than required
> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1350 
> -> 0.233.1351 not applied to 0.233.1354: current "num_updates" is greater 
> than required
> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1351 
> -> 0.233.1352 not applied to 0.233.1354: current "num_updates" is greater 
> than required
> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1352 
> -> 0.233.1353 not applied to 0.233.1354: current "num_updates" is greater 
> than required
> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1353 
> -> 0.233.1354 not applied to 0.233.1354: current "num_updates" is greater 
> than required
> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 491 
> for last-failure-Cachier=1419729443 failed: Application of an update diff 
> failed
> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 494 
> for fail-count-Cachier=1 failed: Application of an update diff failed
> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 497 
> for probe_complete=true failed: Application of an update diff failed
> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 500 
> for last-failure-Cachier=1419729443 failed: Application of an update diff 
> failed
> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 503 
> for fail-count-Cachier=1 failed: Application of an update diff failed
> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1338 
> -> 0.233.1339 not applied to 0.233.1382: current "num_updates" is greater 
> than required
> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1339 
> -> 0.233.1340 not applied to 0.233.1382: current "num_updates" is greater 
> than required
> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1340 
> -> 0.233.1341 not applied to 0.233.1382: current "num_updates" is greater 
> than required
> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1341 
> -> 0.233.1342 not applied to 0.233.1382: current "num_updates" is greater 
> than required
> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1342 
> -> 0.233.1343 not applied to 0.233.1382: current "num_updates" is greater 
> than required
> 
> After examining the corosync processes with ps, I found that on all my nodes 
> there are zombie corosync processes like:
> 
> root 13892  0.0  0.0  0 0 ?ZDec26   0:04 [corosync] 
> 
> root 21793  0.0  0.0  0 0 ?ZDec26   0:00 [corosync] 
> 
> root 27009  1.3  1.0 714292 10784 ?Ssl  Dec18 223:38 
> /usr/sbin/corosync
> 
> Is it OK to have zombie corosync processes on the nodes? Or does it suggest that 
> something is going wrong? 
> 
> Thanks in advance
> 
> --
> Best regards,
> Sergey Arlashin
> 
> 
> 
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Pacemaker + drbd + Cman Error: gfs_controld join connect error: Connection refused error mounting lockproto lock_dlm

2015-01-04 Thread Digimer
You've disabled stonith, which alone is a very bad idea with DRBD and 
cman. Please enable it, configure and test stonith devices, and then 
hook DRBD into pacemaker using the fence handler 
'/path/to/crm-fence-peer.sh' and set 'fencing resource-and-stonith'. 
Then configure cman to hook into pacemaker's fencing using the 
fence_pcmk fence agent in cluster.conf.
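
For reference, a minimal sketch of what that wiring can look like. The resource and node names are taken from the configuration below; the crm-fence-peer.sh path is the one usually shipped by drbd-utils, and the exact drbd.conf section that accepts 'fencing' differs between DRBD versions (check drbd.conf(5) for yours). The cluster.conf fragment follows the standard fence_pcmk redirect pattern:

  # drbd.conf -- let Pacemaker fence the peer's resource instead of ignoring it
  resource pcmAppData {
      disk {
          fencing resource-and-stonith;
      }
      handlers {
          fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
          after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
      }
      ...
  }

  <!-- cluster.conf -- redirect cman's fencing requests to Pacemaker -->
  <clusternode name="pcmk-1" nodeid="1">
      <fence>
          <method name="pcmk-redirect">
              <device name="pcmk" port="pcmk-1"/>
          </method>
      </fence>
  </clusternode>
  <fencedevices>
      <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>

On top of that you still need real stonith resources configured and tested in Pacemaker; fence_pcmk only forwards cman's requests, it does not fence anything by itself.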


On 04/01/15 03:01 PM, raby wrote:

Hi this is the drbdadm dump

/# resource pcmAppData on pcmk-1: not ignored, not stacked
# defined at /etc/drbd.conf:10
resource pcmAppData {
    on pcmk-1 {
        device      /dev/drbd1 minor 1;
        disk        /dev/mapper/VolGroup-drbd--demo;
        meta-disk   internal;
        address     ipv4 192.168.203.128:7789;
    }
    on pcmk-2 {
        device      /dev/drbd1 minor 1;
        disk        /dev/mapper/VolGroup-drbd--demo;
        meta-disk   internal;
        address     ipv4 192.168.203.130:7789;
    }
    net {
        verify-alg          sha1;
        allow-two-primaries yes;
    }
}/

and this is the pacemaker configuration

/node pcmk-1 \
 attributes standby="off"
node pcmk-2 \
 attributes standby="off"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
 params ip="192.168.203.150" cidr_netmask="32" \
 op monitor interval="30s"
primitive PcmData ocf:linbit:drbd \
 params drbd_resource="pcmAppData" \
 op monitor interval="60s"
primitive PcmFS ocf:heartbeat:Filesystem \
 params device="/dev/drbd/by-res/pcmAppData"
directory="/home/hassan/logs" fstype="gfs2"
primitive PcmtestApp ocf:raby:RabyAgent \
 op monitor interval="100ms" \
 meta target-role="Started" migration-threshold="1" failure-timeout="1"
ms PcmDataClone PcmData \
 meta master-max="2" master-node-max="1" clone-max="2"
clone-node-max="1" notify="true" target-role="Started"
clone ClusterIPClone ClusterIP \
 meta golbally-unique="true" clone-max="2" clone-node-max="2"
clone PcmFSClone PcmFS
clone PcmtestAppClone PcmtestApp
colocation PcmtestApp-with-PcmFS inf: PcmtestAppClone PcmFSClone
colocation PcmtestApp-with-ip inf: PcmtestAppClone ClusterIPClone
colocation fs_on_drbd inf: PcmFSClone PcmDataClone:Master
order PcmFS-after-PcmData inf: PcmDataClone:promote PcmFSClone:start
order PcmtestApp-after-PcmFS inf: PcmFSClone PcmtestAppClone
property $id="cib-bootstrap-options" \
 dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
 cluster-infrastructure="cman" \
 expected-quorum-votes="2" \
 stonith-enabled="false" \
 no-quorum-policy="ignore" \
 last-lrm-refresh="1417122793"/


As for the upgrade, unfortunately I cannot move to Ubuntu 14.04 +
the newer Pacemaker 1.1.


2015-01-02 14:48 GMT-08:00 Digimer <li...@alteeve.ca>:

Pacemaker 1.1.6 is very, very old. If you can upgrade to Ubuntu
14.04 LTS you will get 1.1.10 (or newer). A *lot* changed in the
meantime, plus a lot of bug fixes.

Can you upgrade?

If you still have trouble, please reply with your pacemaker
configuration as well, and the 'drbdadm dump' output wouldn't hurt.

On 02/01/15 05:39 PM, raby wrote:

Hi, I am following the Clusters from Scratch guide to set up an
active/active DRBD configuration. Here is my configuration:
Ubuntu 12 kernel 3.11.0-19-generic
pacemaker 1.1.6
corosync 1.4.2
cman 3.1.7
gfs2 3.1.3

I am trying to do
*mount /dev/drbd1*
and I get this message:
/gfs_controld join connect error: Connection refused
error mounting lockproto lock_dlm/

Here is the corosync configuration:
/# Please read the corosync.conf.5 manual page

compatibility: whitetank

totem {
  version: 2
  secauth: off
  threads: 0
  interface {

  member {
  memberaddr: 192.168.203.128
  }
  member {
  memberaddr: 192.168.203.130
  }


  ringnumber: 0
bindnetaddr: 192.168.203.0
mcastaddr: 239.255.1.1
mcastport: 4000
  ttl: 1
  }
}

logging {
  fileline: off
  to_stderr: no
  to_logfile: yes
  to_syslog: yes
  logfile: /var/log/cluster/corosync.log
  debug: off
  timestamp: on
  logger_subsys {
  subsys: AMF
  debug: off
  }
}

amf {
  mode: disabled
}


service {

   # Load the Pacemaker Cluster Resource Manager
   ver: 1
   name: pacemaker
   }
   aisexec {
   user: root
   group: root
   }/

The cluster.conf configuration
 

Re: [Pacemaker] Pacemaker + drbd + Cman Error: gfs_controld join connect error: Connection refused error mounting lockproto lock_dlm

2015-01-04 Thread raby
Hi this is the drbdadm dump

*# resource pcmAppData on pcmk-1: not ignored, not stacked
# defined at /etc/drbd.conf:10
resource pcmAppData {
    on pcmk-1 {
        device      /dev/drbd1 minor 1;
        disk        /dev/mapper/VolGroup-drbd--demo;
        meta-disk   internal;
        address     ipv4 192.168.203.128:7789;
    }
    on pcmk-2 {
        device      /dev/drbd1 minor 1;
        disk        /dev/mapper/VolGroup-drbd--demo;
        meta-disk   internal;
        address     ipv4 192.168.203.130:7789;
    }
    net {
        verify-alg          sha1;
        allow-two-primaries yes;
    }
}*

and this is the pacemaker configuration

*node pcmk-1 \
    attributes standby="off"
node pcmk-2 \
    attributes standby="off"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
    params ip="192.168.203.150" cidr_netmask="32" \
    op monitor interval="30s"
primitive PcmData ocf:linbit:drbd \
    params drbd_resource="pcmAppData" \
    op monitor interval="60s"
primitive PcmFS ocf:heartbeat:Filesystem \
    params device="/dev/drbd/by-res/pcmAppData" directory="/home/hassan/logs" fstype="gfs2"
primitive PcmtestApp ocf:raby:RabyAgent \
    op monitor interval="100ms" \
    meta target-role="Started" migration-threshold="1" failure-timeout="1"
ms PcmDataClone PcmData \
    meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
clone ClusterIPClone ClusterIP \
    meta golbally-unique="true" clone-max="2" clone-node-max="2"
clone PcmFSClone PcmFS
clone PcmtestAppClone PcmtestApp
colocation PcmtestApp-with-PcmFS inf: PcmtestAppClone PcmFSClone
colocation PcmtestApp-with-ip inf: PcmtestAppClone ClusterIPClone
colocation fs_on_drbd inf: PcmFSClone PcmDataClone:Master
order PcmFS-after-PcmData inf: PcmDataClone:promote PcmFSClone:start
order PcmtestApp-after-PcmFS inf: PcmFSClone PcmtestAppClone
property $id="cib-bootstrap-options" \
    dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
    cluster-infrastructure="cman" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore" \
    last-lrm-refresh="1417122793"*


As for the upgrade, unfortunately I cannot move to Ubuntu 14.04 +
the newer Pacemaker 1.1.
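
Regarding the original error ("gfs_controld join connect error: Connection refused"): mounting a gfs2 filesystem with lock_dlm requires gfs_controld and dlm_controld to be running, and on a cman stack those are normally started by the cman init script. A quick check, sketched here with generic service names that may differ per distribution:

  # is cman up and quorate?
  service cman status
  cman_tool status

  # gfs_controld and dlm_controld must both be running before the mount
  ps -e | grep -E 'dlm_controld|gfs_controld'

  # if they are missing, (re)start cman and check /var/log/cluster/ for why they did not come up
  service cman start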


2015-01-02 14:48 GMT-08:00 Digimer :

> Pacemaker 1.1.6 is very, very old. If you can upgrade to Ubuntu 14.04 LTS
> you will get 1.1.10 (or newer). A *lot* changed in the meantime, plus a lot
> of bug fixes.
>
> Can you upgrade?
>
> If you still have trouble, please reply with your pacemaker configuration
> as well, and the 'drbdadm dump' output wouldn't hurt.
>
> On 02/01/15 05:39 PM, raby wrote:
>
>> Hi, I am following the Clusters from Scratch guide to set up an
>> active/active DRBD configuration. Here is my configuration:
>> Ubuntu 12 kernel 3.11.0-19-generic
>> pacemaker 1.1.6
>> corosync 1.4.2
>> cman 3.1.7
>> gfs2 3.1.3
>>
>> I am trying to do
>> *mount /dev/drbd1*
>> and I get this message:
>> /gfs_controld join connect error: Connection refused
>> error mounting lockproto lock_dlm/
>>
>> Here is the corosync configuration:
>> /# Please read the corosync.conf.5 manual page
>>
>> compatibility: whitetank
>>
>> totem {
>>  version: 2
>>  secauth: off
>>  threads: 0
>>  interface {
>>
>>  member {
>>  memberaddr: 192.168.203.128
>>  }
>>  member {
>>  memberaddr: 192.168.203.130
>>  }
>>
>>
>>  ringnumber: 0
>> bindnetaddr: 192.168.203.0
>> mcastaddr: 239.255.1.1
>> mcastport: 4000
>>  ttl: 1
>>  }
>> }
>>
>> logging {
>>  fileline: off
>>  to_stderr: no
>>  to_logfile: yes
>>  to_syslog: yes
>>  logfile: /var/log/cluster/corosync.log
>>  debug: off
>>  timestamp: on
>>  logger_subsys {
>>  subsys: AMF
>>  debug: off
>>  }
>> }
>>
>> amf {
>>  mode: disabled
>> }
>>
>>
>> service {
>>
>>   # Load the Pacemaker Cluster Resource Manager
>>   ver: 1
>>   name: pacemaker
>>   }
>>   aisexec {
>>   user: root
>>   group: root
>>   }/
>>
>> The cluster.conf configuration
>>
>> In previous threads I have seen that you have to check how dlm is
>> managed, but I have no idea how to check that. Any help? Thanks.
>>
>>
>> ___
>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>>
>
> --
> Digimer
> Papers and Projects: ht

[Pacemaker] Problems with SBD

2015-01-04 Thread Oriol Mula-Valls
Hi everyone,

I have a two-node system with SLES 11 SP3 (pacemaker-1.1.9-0.19.102,
corosync-1.4.5-0.18.15, sbd-1.1-0.13.153). Since December we have had
several reboots of the system due to SBD: on the 22nd, 24th and 26th. The
last reboot happened yesterday, January 3rd. The messages are the same
every time.
/var/log/messages:Jan  3 11:55:08 kernighan sbd: [7879]: info: Cancelling
IO request due to timeout (rw=0)
/var/log/messages:Jan  3 11:55:08 kernighan sbd: [7879]: ERROR: mbox read
failed in servant.
/var/log/messages:Jan  3 11:55:08 kernighan sbd: [7878]: WARN: Servant for
/dev/sdc1 (pid: 7879) has terminated
/var/log/messages:Jan  3 11:55:08 kernighan sbd: [7878]: WARN: Servant for
/dev/sdc1 outdated (age: 4)
/var/log/messages:Jan  3 11:55:08 kernighan sbd: [8183]: info: Servant
starting for device /dev/sdc1
/var/log/messages:Jan  3 11:55:11 kernighan sbd: [8183]: info: Cancelling
IO request due to timeout (rw=0)
/var/log/messages:Jan  3 11:55:11 kernighan sbd: [8183]: ERROR: Unable to
read header from device 5
/var/log/messages:Jan  3 11:55:11 kernighan sbd: [8183]: ERROR: Not a valid
header on /dev/sdc1
/var/log/messages:Jan  3 11:55:11 kernighan sbd: [7878]: WARN: Servant for
/dev/sdc1 (pid: 8183) has terminated
/var/log/messages:Jan  3 11:55:11 kernighan sbd: [7878]: WARN: Latency: No
liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)

The SBD device is an iSCSI drive shared by a Synology box.

Could anyone provide some guidance on what is happening, please?
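
One thing worth checking with an iSCSI-backed SBD device is whether the on-disk timeouts leave enough headroom for the storage's worst-case latency; the log above shows the read being cancelled on timeout and liveness lapsing for 4 s against a 3 s threshold. A sketch of the usual inspection commands (the device path is taken from the log; the timeout values are only an illustration, not a recommendation):

  # show the SBD header, including the configured watchdog and msgwait timeouts
  sbd -d /dev/sdc1 dump

  # re-initialising the header with larger timeouts -- this rewrites the device,
  # so do it only with the cluster stopped on both nodes
  # (-1 sets the watchdog timeout, -4 the msgwait timeout, usually about twice the watchdog value)
  sbd -d /dev/sdc1 -1 30 -4 60 create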

Thanks in advance,

Oriol
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org