Re: [Pacemaker] Two node KVM cluster

2013-05-01 Thread Oriol Mula-Valls

On 01/05/13 06:12, Andrew Beekhof wrote:


On 28/04/2013, at 9:19 PM, Oriol Mula-Valls  wrote:


Hi,

I have modified the previous configuration to use sbd fencing. I have also 
fixed several other issues with the configuration, and now, when the node reboots, 
it seems unable to rejoin the cluster.

I attach the debug log I have just generated. Node was rebooted around 11:51:41 
and came back at 12:52:47.

The boot order of the services is:
1. sbd
2. corosync
3. pacemaker
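
For reference, a quick way to sanity-check the sbd device and the sysvinit start order on 
Debian Wheezy might look like this (the device path is a placeholder, not taken from the thread):

sbd -d /dev/disk/by-id/SHARED-SBD-DISK list        # slot allocation and any pending fence messages
sbd -d /dev/disk/by-id/SHARED-SBD-DISK dump        # header and timeouts written on the sbd partition
ls /etc/rc2.d | grep -E 'sbd|corosync|pacemaker'   # SNN prefixes show the actual start order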


It doesn't look like pacemaker was restarted on node1, just corosync.


The node was forcibly rebooted with echo b > /proc/sysrq-trigger. I am 
still testing what will happen in case of an unexpected reboot.
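
A minimal way to watch such a test from the other side, using the standard Pacemaker tools:

# on the node being killed:
echo b > /proc/sysrq-trigger                 # immediate reboot, no clean shutdown
# on the surviving node:
crm_mon -rf1                                 # one-shot status: fail counts and resource moves
grep -iE 'stonith|fence' /var/log/syslog     # confirm the fencing action completed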






Could someone help me, please?

Thanks,
Oriol

On 16/04/13 06:10, Andrew Beekhof wrote:


On 10/04/2013, at 3:20 PM, Oriol Mula-Valls   wrote:


On 10/04/13 02:10, Andrew Beekhof wrote:


On 09/04/2013, at 7:31 PM, Oriol Mula-Valls wrote:


Thanks Andrew, I've managed to set up the system and currently have it working, 
but it is still under testing.

I have configured external/ipmi as the fencing device and then I force a reboot with 
echo b > /proc/sysrq-trigger. The fencing is working properly: the node is 
shut off and the VM migrated. However, as soon as I turn on the fenced now and the 
OS has started, the surviving node is shut down. Is it normal or am I doing something 
wrong?


Can you clarify "turn on the fenced"?



To restart the fenced node I do either a power on with ipmitool or I power it 
on using the iRMC web interface.


Oh, "fenced now" was meant to be "fenced node".  That makes more sense now :)

To answer your question, I would not expect the surviving node to be fenced 
when the previous node returns.
The network between the two is still functional?




--
Oriol Mula Valls
Institut Català de Ciències del Clima (IC3)
Doctor Trueta 203 - 08005 Barcelona
Tel:+34 93 567 99 77




--
Oriol Mula Valls
Institut Català de Ciències del Clima (IC3)
Doctor Trueta 203 - 08005 Barcelona
Tel:+34 93 567 99 77



Re: [Pacemaker] Two node KVM cluster

2013-04-30 Thread Andrew Beekhof

On 28/04/2013, at 9:19 PM, Oriol Mula-Valls  wrote:

> Hi,
> 
> I have modified the previous configuration to use sbd fencing. I have also 
> fixed several other issues with the configuration, and now, when the node 
> reboots, it seems unable to rejoin the cluster.
> 
> I attach the debug log I have just generated. Node was rebooted around 
> 11:51:41 and came back at 12:52:47.
> 
> The boot order of the services is:
> 1. sbd
> 2. corosync
> 3. pacemaker

It doesn't look like pacemaker was restarted on node1, just corosync.
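
A quick way to confirm that on node1 would be something along these lines (init script 
names as shipped by the Debian packages):

ps -eo pid,lstart,cmd | grep -E '[c]orosync|[p]acemakerd'   # compare the daemons' start times
service pacemaker status || service pacemaker start         # start pacemaker if only corosync came up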

> 
> Could someone help me, please?
> 
> Thanks,
> Oriol
> 
> On 16/04/13 06:10, Andrew Beekhof wrote:
>> 
>> On 10/04/2013, at 3:20 PM, Oriol Mula-Valls  wrote:
>> 
>>> On 10/04/13 02:10, Andrew Beekhof wrote:
 
On 09/04/2013, at 7:31 PM, Oriol Mula-Valls wrote:
 
> Thanks Andrew, I've managed to set up the system and currently have it 
> working, but it is still under testing.
> 
> I have configured external/ipmi as the fencing device and then I force a 
> reboot with echo b > /proc/sysrq-trigger. The fencing is working 
> properly: the node is shut off and the VM migrated. However, as soon as 
> I turn on the fenced now and the OS has started, the surviving node is shut 
> down. Is it normal or am I doing something wrong?
 
 Can you clarify "turn on the fenced"?
 
>>> 
>>> To restart the fenced node I do either a power on with ipmitool or I power 
>>> it on using the iRMC web interface.
>> 
>> Oh, "fenced now" was meant to be "fenced node".  That makes more sense now :)
>> 
>> To answer your question, I would not expect the surviving node to be fenced 
>> when the previous node returns.
>> The network between the two is still functional?
>> 
> 
> 
> -- 
> Oriol Mula Valls
> Institut Català de Ciències del Clima (IC3)
> Doctor Trueta 203 - 08005 Barcelona
> Tel:+34 93 567 99 77


Re: [Pacemaker] Two node KVM cluster

2013-04-29 Thread Andrew Beekhof

On 17/04/2013, at 4:02 PM, Oriol Mula-Valls  wrote:

> On 16/04/13 06:10, Andrew Beekhof wrote:
>> 
>> On 10/04/2013, at 3:20 PM, Oriol Mula-Valls  wrote:
>> 
>>> On 10/04/13 02:10, Andrew Beekhof wrote:
 
On 09/04/2013, at 7:31 PM, Oriol Mula-Valls wrote:
 
> Thanks Andrew, I've managed to set up the system and currently have it 
> working, but it is still under testing.
> 
> I have configured external/ipmi as the fencing device and then I force a 
> reboot with echo b > /proc/sysrq-trigger. The fencing is working 
> properly: the node is shut off and the VM migrated. However, as soon as 
> I turn on the fenced now and the OS has started, the surviving node is shut 
> down. Is it normal or am I doing something wrong?
 
 Can you clarify "turn on the fenced"?
 
>>> 
>>> To restart the fenced node I do either a power on with ipmitool or I power 
>>> it on using the iRMC web interface.
>> 
>> Oh, "fenced now" was meant to be "fenced node".  That makes more sense now :)
>> 
>> To answer your question, I would not expect the surviving node to be fenced 
>> when the previous node returns.
>> The network between the two is still functional?
> 
> Sorry, I didn't realise the mistake even while writing the answer :)
> 
> The IPMI network is still working between the nodes.

Ok, but what about the network corosync is using?
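
With corosync 1.4 the ring can be checked directly on both nodes, for example:

corosync-cfgtool -s                 # ring status; should report "no faults"
corosync-objctl | grep -i member    # membership as corosync currently sees it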

> 
> Thanks,
> Oriol
> 
>> 
>>> 
> 
> On the other hand I've seen that in case I completely lose power fencing 
> obviously fails. Would SBD stonith solve this issue?
> 
> Kind regards,
> Oriol
> 
> On 08/04/13 04:11, Andrew Beekhof wrote:
>> 
>> On 03/04/2013, at 9:15 PM, Oriol Mula-Valls wrote:
>> 
>>> Hi,
>>> 
>>> I started with Linux HA about one year ago. Currently I'm facing a 
>>> new project in which I have to set up two nodes with highly available 
>>> virtual machines. I have used Digimer's tutorial 
>>> (https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial) as a starting point.
>>> 
>>> To deploy this new infrastructure I have two Fujitsu Primergy Rx100S7. 
>>> Both machines have 8GB of RAM and 2x500GB HDs. I started by creating a 
>>> software RAID1 with the internal drives and installing Debian 7.0 
>>> (Wheezy). Apart from the O.S. partition I have created 3 more 
>>> partitions: one for the shared storage between both machines with OCFS2, 
>>> and the other two will be used as PVs to create LVs to support the VMs 
>>> (one for the VMs that will be primary on node1 and the other for VMs that 
>>> are primary on node2). These 3 partitions are replicated using DRBD.
>>> 
>>> The shared storage folder contains:
>>> * ISO images needed when provisioning VMs
>>> * scripts used to call virt-install which handles the creation of our 
>>> VMs.
>>> * XML definition files which define the emulated hardware backing the 
>>> VMs
>>> * old copies of the XML definition files.
>>> 
>>> I have more or less done the configuration for the OCFS2 fs and I was 
>>> about to start the configuration of cLVM for one of the VGs but I have 
>>> some doubts. I have one dlm for the OCFS2 filesystem, should I create 
>>> another for cLVM RA?
>> 
>> No, there should only ever be one dlm resource (cloned like you have it)
>> 
>>> 
>>> This is the current configuration:
>>> node node1
>>> node node2
>>> primitive p_dlm_controld ocf:pacemaker:controld \
>>> op start interval="0" timeout="90" \
>>> op stop interval="0" timeout="100" \
>>> op monitor interval="10"
>>> primitive p_drbd_shared ocf:linbit:drbd \
>>> params drbd_resource="shared" \
>>> op monitor interval="10" role="Master" timeout="20" \
>>> op monitor interval="20" role="Slave" timeout="20" \
>>> op start interval="0" timeout="240s" \
>>> op stop interval="0" timeout="120s"
>>> primitive p_drbd_vm_1 ocf:linbit:drbd \
>>> params drbd_resource="vm_1" \
>>> op monitor interval="10" role="Master" timeout="20" \
>>> op monitor interval="20" role="Slave" timeout="20" \
>>> op start interval="0" timeout="240s" \
>>> op stop interval="0" timeout="120s"
>>> primitive p_fs_shared ocf:heartbeat:Filesystem \
>>> params device="/dev/drbd/by-res/shared" directory="/shared" 
>>> fstype="ocfs2" \
>>> meta target-role="Started" \
>>> op monitor interval="10"
>>> primitive p_ipmi_node1 stonith:external/ipmi \
>>> params hostname="node1" userid="admin" passwd="xxx" 
>>> ipaddr="10.0.0.2" interface="lanplus"
>>> primitive p_ipmi_node2 stonith:external/ipmi \
>>> params hostname="node2" userid="admin" passwd="xxx" 
>>> ipaddr="10.0.0.3" interface="lanplus"
>>> primitive p_libvirt

Re: [Pacemaker] Two node KVM cluster

2013-04-16 Thread Oriol Mula-Valls

On 16/04/13 06:10, Andrew Beekhof wrote:


On 10/04/2013, at 3:20 PM, Oriol Mula-Valls  wrote:


On 10/04/13 02:10, Andrew Beekhof wrote:


On 09/04/2013, at 7:31 PM, Oriol Mula-Valls   wrote:


Thanks Andrew, I've managed to set up the system and currently have it working, 
but it is still under testing.

I have configured external/ipmi as the fencing device and then I force a reboot with 
echo b > /proc/sysrq-trigger. The fencing is working properly: the node is 
shut off and the VM migrated. However, as soon as I turn on the fenced now and the 
OS has started, the surviving node is shut down. Is it normal or am I doing something 
wrong?


Can you clarify "turn on the fenced"?



To restart the fenced node I do either a power on with ipmitool or I power it 
on using the iRMC web interface.


Oh, "fenced now" was meant to be "fenced node".  That makes more sense now :)

To answer your question, I would not expect the surviving node to be fenced 
when the previous node returns.
The network between the two is still functional?


Sorry, I didn't realise the mistake even while writing the answer :)

The IPMI network is still working between the nodes.

Thanks,
Oriol







On the other hand I've seen that in case I completely lose power fencing 
obviously fails. Would SBD stonith solve this issue?

Kind regards,
Oriol

On 08/04/13 04:11, Andrew Beekhof wrote:


On 03/04/2013, at 9:15 PM, Oriol Mula-Valls wrote:


Hi,

I started with Linux HA about one year ago. Currently I'm facing a new 
project in which I have to set up two nodes with highly available virtual 
machines. I have used Digimer's tutorial 
(https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial) as a starting point.

To deploy this new infrastructure I have two Fujitsu Primergy Rx100S7. Both 
machines have 8GB of RAM and 2x500GB HDs. I started by creating a software RAID1 
with the internal drives and installing Debian 7.0 (Wheezy). Apart from the 
O.S. partition I have created 3 more partitions: one for the shared storage 
between both machines with OCFS2, and the other two will be used as PVs to 
create LVs to support the VMs (one for the VMs that will be primary on node1 and 
the other for VMs that are primary on node2). These 3 partitions are replicated 
using DRBD.

The shared storage folder contains:
* ISO images needed when provisioning VMs
* scripts used to call virt-install which handles the creation of our VMs.
* XML definition files which define the emulated hardware backing the VMs
* old copies of the XML definition files.

I have more or less done the configuration for the OCFS2 fs and I was about to 
start the configuration of cLVM for one of the VGs but I have some doubts. I 
have one dlm for the OCFS2 filesystem, should I create another for cLVM RA?


No, there should only ever be one dlm resource (cloned like you have it)



This is the current configuration:
node node1
node node2
primitive p_dlm_controld ocf:pacemaker:controld \
op start interval="0" timeout="90" \
op stop interval="0" timeout="100" \
op monitor interval="10"
primitive p_drbd_shared ocf:linbit:drbd \
params drbd_resource="shared" \
op monitor interval="10" role="Master" timeout="20" \
op monitor interval="20" role="Slave" timeout="20" \
op start interval="0" timeout="240s" \
op stop interval="0" timeout="120s"
primitive p_drbd_vm_1 ocf:linbit:drbd \
params drbd_resource="vm_1" \
op monitor interval="10" role="Master" timeout="20" \
op monitor interval="20" role="Slave" timeout="20" \
op start interval="0" timeout="240s" \
op stop interval="0" timeout="120s"
primitive p_fs_shared ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/shared" directory="/shared" 
fstype="ocfs2" \
meta target-role="Started" \
op monitor interval="10"
primitive p_ipmi_node1 stonith:external/ipmi \
params hostname="node1" userid="admin" passwd="xxx" ipaddr="10.0.0.2" 
interface="lanplus"
primitive p_ipmi_node2 stonith:external/ipmi \
params hostname="node2" userid="admin" passwd="xxx" ipaddr="10.0.0.3" 
interface="lanplus"
primitive p_libvirtd lsb:libvirt-bin \
op monitor interval="120s" \
op start interval="0" \
op stop interval="0"
primitive p_o2cb ocf:pacemaker:o2cb \
op start interval="0" timeout="90" \
op stop interval="0" timeout="100" \
op monitor interval="10" \
meta target-role="Started"
group g_shared p_dlm_controld p_o2cb p_fs_shared
ms ms_drbd_shared p_drbd_shared \
meta master-max="2" clone-max="2" notify="true"
ms ms_drbd_vm_1 p_drbd_vm_1 \
meta master-max="2" clone-max="2" notify="true"
clone cl_libvirtd p_libvirtd \
meta globally-unique="false" interleave="true"
clone cl_shared g_shared \
meta interleave="true"
location l_ipmi_node1 p_ipmi_node1 -inf: node1
location l_ipmi_node2 p_ipmi_node2 -inf: node2
order o_drbd_before_shared inf: ms_drbd_share

Re: [Pacemaker] Two node KVM cluster

2013-04-15 Thread Andrew Beekhof

On 10/04/2013, at 3:20 PM, Oriol Mula-Valls  wrote:

> On 10/04/13 02:10, Andrew Beekhof wrote:
>> 
>> On 09/04/2013, at 7:31 PM, Oriol Mula-Valls  wrote:
>> 
>>> Thanks Andrew, I've managed to set up the system and currently have it 
>>> working, but it is still under testing.
>>> 
>>> I have configured external/ipmi as the fencing device and then I force a reboot 
>>> with echo b > /proc/sysrq-trigger. The fencing is working properly: 
>>> the node is shut off and the VM migrated. However, as soon as I turn on the 
>>> fenced now and the OS has started, the surviving node is shut down. Is it normal 
>>> or am I doing something wrong?
>> 
>> Can you clarify "turn on the fenced"?
>> 
> 
> To restart the fenced node I do either a power on with ipmitool or I power it 
> on using the iRMC web interface.

Oh, "fenced now" was meant to be "fenced node".  That makes more sense now :)

To answer your question, I would not expect the surviving node to be fenced 
when the previous node returns.
The network between the two is still functional?

> 
>>> 
>>> On the other hand I've seen that in case I completely lose power fencing 
>>> obviously fails. Would SBD stonith solve this issue?
>>> 
>>> Kind regards,
>>> Oriol
>>> 
>>> On 08/04/13 04:11, Andrew Beekhof wrote:
 
On 03/04/2013, at 9:15 PM, Oriol Mula-Valls wrote:
 
> Hi,
> 
> I started with Linux HA about one year ago. Currently I'm facing a new 
> project in which I have to set up two nodes with highly available virtual 
> machines. I have used Digimer's tutorial 
> (https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial) as a starting point.
> 
> To deploy this new infrastructure I have two Fujitsu Primergy Rx100S7. 
> Both machines have 8GB of RAM and 2x500GB HDs. I started by creating a 
> software RAID1 with the internal drives and installing Debian 7.0 
> (Wheezy). Apart from the O.S. partition I have created 3 more partitions: 
> one for the shared storage between both machines with OCFS2, and the other 
> two will be used as PVs to create LVs to support the VMs (one for the 
> VMs that will be primary on node1 and the other for VMs that are primary on 
> node2). These 3 partitions are replicated using DRBD.
> 
> The shared storage folder contains:
> * ISO images needed when provisioning VMs
> * scripts used to call virt-install which handles the creation of our VMs.
> * XML definition files which define the emulated hardware backing the VMs
> * old copies of the XML definition files.
> 
> I have more or less done the configuration for the OCFS2 fs and I was 
> about to start the configuration of cLVM for one of the VGs but I have 
> some doubts. I have one dlm for the OCFS2 filesystem, should I create 
> another for cLVM RA?
 
 No, there should only ever be one dlm resource (cloned like you have it)
 
> 
> This is the current configuration:
> node node1
> node node2
> primitive p_dlm_controld ocf:pacemaker:controld \
>   op start interval="0" timeout="90" \
>   op stop interval="0" timeout="100" \
>   op monitor interval="10"
> primitive p_drbd_shared ocf:linbit:drbd \
>   params drbd_resource="shared" \
>   op monitor interval="10" role="Master" timeout="20" \
>   op monitor interval="20" role="Slave" timeout="20" \
>   op start interval="0" timeout="240s" \
>   op stop interval="0" timeout="120s"
> primitive p_drbd_vm_1 ocf:linbit:drbd \
>   params drbd_resource="vm_1" \
>   op monitor interval="10" role="Master" timeout="20" \
>   op monitor interval="20" role="Slave" timeout="20" \
>   op start interval="0" timeout="240s" \
>   op stop interval="0" timeout="120s"
> primitive p_fs_shared ocf:heartbeat:Filesystem \
>   params device="/dev/drbd/by-res/shared" directory="/shared" 
> fstype="ocfs2" \
>   meta target-role="Started" \
>   op monitor interval="10"
> primitive p_ipmi_node1 stonith:external/ipmi \
>   params hostname="node1" userid="admin" passwd="xxx" ipaddr="10.0.0.2" 
> interface="lanplus"
> primitive p_ipmi_node2 stonith:external/ipmi \
>   params hostname="node2" userid="admin" passwd="xxx" ipaddr="10.0.0.3" 
> interface="lanplus"
> primitive p_libvirtd lsb:libvirt-bin \
>   op monitor interval="120s" \
>   op start interval="0" \
>   op stop interval="0"
> primitive p_o2cb ocf:pacemaker:o2cb \
>   op start interval="0" timeout="90" \
>   op stop interval="0" timeout="100" \
>   op monitor interval="10" \
>   meta target-role="Started"
> group g_shared p_dlm_controld p_o2cb p_fs_shared
> ms ms_drbd_shared p_drbd_shared \
>   meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_vm_1 p_drbd_vm_1 \
>   meta master-max="2" clone-max="2" notify="true"
> clone cl_libvirtd p_libvirtd \
>   meta globally-unique="false" interleave="true"

Re: [Pacemaker] Two node KVM cluster

2013-04-09 Thread Oriol Mula-Valls

On 10/04/13 02:10, Andrew Beekhof wrote:


On 09/04/2013, at 7:31 PM, Oriol Mula-Valls  wrote:


Thanks Andrew, I've managed to set up the system and currently have it working, 
but it is still under testing.

I have configured external/ipmi as the fencing device and then I force a reboot with 
echo b > /proc/sysrq-trigger. The fencing is working properly: the node is 
shut off and the VM migrated. However, as soon as I turn on the fenced now and the 
OS has started, the surviving node is shut down. Is it normal or am I doing something 
wrong?


Can you clarify "turn on the fenced"?



To restart the fenced node I do either a power on with ipmitool or I 
power it on using the iRMC web interface.




On the other hand I've seen that in case I completely lose power fencing 
obviously fails. Would SBD stonith solve this issue?

Kind regards,
Oriol

On 08/04/13 04:11, Andrew Beekhof wrote:


On 03/04/2013, at 9:15 PM, Oriol Mula-Valls   wrote:


Hi,

I started with Linux HA about one year ago. Currently I'm facing a new 
project in which I have to set up two nodes with highly available virtual 
machines. I have used Digimer's tutorial 
(https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial) as a starting point.

To deploy this new infrastructure I have two Fujitsu Primergy Rx100S7. Both 
machines have 8GB of RAM and 2x500GB HDs. I started by creating a software RAID1 
with the internal drives and installing Debian 7.0 (Wheezy). Apart from the 
O.S. partition I have created 3 more partitions: one for the shared storage 
between both machines with OCFS2, and the other two will be used as PVs to 
create LVs to support the VMs (one for the VMs that will be primary on node1 and 
the other for VMs that are primary on node2). These 3 partitions are replicated 
using DRBD.

The shared storage folder contains:
* ISO images needed when provisioning VMs
* scripts used to call virt-install which handles the creation of our VMs.
* XML definition files which define the emulated hardware backing the VMs
* old copies of the XML definition files.

I have more or less done the configuration for the OCFS2 fs and I was about to 
start the configuration of cLVM for one of the VGs but I have some doubts. I 
have one dlm for the OCFS2 filesystem, should I create another for cLVM RA?


No, there should only ever be one dlm resource (cloned like you have it)



This is the current configuration:
node node1
node node2
primitive p_dlm_controld ocf:pacemaker:controld \
op start interval="0" timeout="90" \
op stop interval="0" timeout="100" \
op monitor interval="10"
primitive p_drbd_shared ocf:linbit:drbd \
params drbd_resource="shared" \
op monitor interval="10" role="Master" timeout="20" \
op monitor interval="20" role="Slave" timeout="20" \
op start interval="0" timeout="240s" \
op stop interval="0" timeout="120s"
primitive p_drbd_vm_1 ocf:linbit:drbd \
params drbd_resource="vm_1" \
op monitor interval="10" role="Master" timeout="20" \
op monitor interval="20" role="Slave" timeout="20" \
op start interval="0" timeout="240s" \
op stop interval="0" timeout="120s"
primitive p_fs_shared ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/shared" directory="/shared" 
fstype="ocfs2" \
meta target-role="Started" \
op monitor interval="10"
primitive p_ipmi_node1 stonith:external/ipmi \
params hostname="node1" userid="admin" passwd="xxx" ipaddr="10.0.0.2" 
interface="lanplus"
primitive p_ipmi_node2 stonith:external/ipmi \
params hostname="node2" userid="admin" passwd="xxx" ipaddr="10.0.0.3" 
interface="lanplus"
primitive p_libvirtd lsb:libvirt-bin \
op monitor interval="120s" \
op start interval="0" \
op stop interval="0"
primitive p_o2cb ocf:pacemaker:o2cb \
op start interval="0" timeout="90" \
op stop interval="0" timeout="100" \
op monitor interval="10" \
meta target-role="Started"
group g_shared p_dlm_controld p_o2cb p_fs_shared
ms ms_drbd_shared p_drbd_shared \
meta master-max="2" clone-max="2" notify="true"
ms ms_drbd_vm_1 p_drbd_vm_1 \
meta master-max="2" clone-max="2" notify="true"
clone cl_libvirtd p_libvirtd \
meta globally-unique="false" interleave="true"
clone cl_shared g_shared \
meta interleave="true"
location l_ipmi_node1 p_ipmi_node1 -inf: node1
location l_ipmi_node2 p_ipmi_node2 -inf: node2
order o_drbd_before_shared inf: ms_drbd_shared:promote cl_shared:start

Packages' versions:
clvm                   2.02.95-7
corosync               1.4.2-3
dlm-pcmk               3.0.12-3.2+deb7u2
drbd8-utils            2:8.3.13-2
libdlm3                3.0.12-3.2+deb7u2
libdlmcontrol3         3.0.12-3.2+deb7u2
ocfs2-tools            1.6.4-1+deb7u1
ocfs2-tools-pacemaker  1.6.4-1+deb7u1
openais                1.1.4-4.1

Re: [Pacemaker] Two node KVM cluster

2013-04-09 Thread Andrew Beekhof

On 09/04/2013, at 7:31 PM, Oriol Mula-Valls  wrote:

> Thanks Andrew, I've managed to set up the system and currently have it 
> working, but it is still under testing.
> 
> I have configured external/ipmi as the fencing device and then I force a reboot 
> with echo b > /proc/sysrq-trigger. The fencing is working properly: the 
> node is shut off and the VM migrated. However, as soon as I turn on the 
> fenced now and the OS has started, the surviving node is shut down. Is it normal or 
> am I doing something wrong?

Can you clarify "turn on the fenced"?

> 
> On the other hand I've seen that in case I completely lose power fencing 
> obviously fails. Would SBD stonith solve this issue?
> 
> Kind regards,
> Oriol
> 
> On 08/04/13 04:11, Andrew Beekhof wrote:
>> 
>> On 03/04/2013, at 9:15 PM, Oriol Mula-Valls  wrote:
>> 
>>> Hi,
>>> 
>>> I started with Linux HA about one year ago. Currently I'm facing a new 
>>> project in which I have to set up two nodes with highly available virtual 
>>> machines. I have used Digimer's tutorial 
>>> (https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial) as a starting point.
>>> 
>>> To deploy this new infrastructure I have two Fujitsu Primergy Rx100S7. Both 
>>> machines have 8GB of RAM and 2x500GB HDs. I started by creating a software 
>>> RAID1 with the internal drives and installing Debian 7.0 (Wheezy). Apart 
>>> from the O.S. partition I have created 3 more partitions: one for the 
>>> shared storage between both machines with OCFS2, and the other two will be 
>>> used as PVs to create LVs to support the VMs (one for the VMs that will be 
>>> primary on node1 and the other for VMs that are primary on node2). These 3 
>>> partitions are replicated using DRBD.
>>> 
>>> The shared storage folder contains:
>>> * ISO images needed when provisioning VMs
>>> * scripts used to call virt-install which handles the creation of our VMs.
>>> * XML definition files which define the emulated hardware backing the VMs
>>> * old copies of the XML definition files.
>>> 
>>> I have more or less done the configuration for the OCFS2 fs and I was about 
>>> to start the configuration of cLVM for one of the VGs but I have some 
>>> doubts. I have one dlm for the OCFS2 filesystem, should I create another 
>>> for cLVM RA?
>> 
>> No, there should only ever be one dlm resource (cloned like you have it)
>> 
>>> 
>>> This is the current configuration:
>>> node node1
>>> node node2
>>> primitive p_dlm_controld ocf:pacemaker:controld \
>>> op start interval="0" timeout="90" \
>>> op stop interval="0" timeout="100" \
>>> op monitor interval="10"
>>> primitive p_drbd_shared ocf:linbit:drbd \
>>> params drbd_resource="shared" \
>>> op monitor interval="10" role="Master" timeout="20" \
>>> op monitor interval="20" role="Slave" timeout="20" \
>>> op start interval="0" timeout="240s" \
>>> op stop interval="0" timeout="120s"
>>> primitive p_drbd_vm_1 ocf:linbit:drbd \
>>> params drbd_resource="vm_1" \
>>> op monitor interval="10" role="Master" timeout="20" \
>>> op monitor interval="20" role="Slave" timeout="20" \
>>> op start interval="0" timeout="240s" \
>>> op stop interval="0" timeout="120s"
>>> primitive p_fs_shared ocf:heartbeat:Filesystem \
>>> params device="/dev/drbd/by-res/shared" directory="/shared" 
>>> fstype="ocfs2" \
>>> meta target-role="Started" \
>>> op monitor interval="10"
>>> primitive p_ipmi_node1 stonith:external/ipmi \
>>> params hostname="node1" userid="admin" passwd="xxx" ipaddr="10.0.0.2" 
>>> interface="lanplus"
>>> primitive p_ipmi_node2 stonith:external/ipmi \
>>> params hostname="node2" userid="admin" passwd="xxx" ipaddr="10.0.0.3" 
>>> interface="lanplus"
>>> primitive p_libvirtd lsb:libvirt-bin \
>>> op monitor interval="120s" \
>>> op start interval="0" \
>>> op stop interval="0"
>>> primitive p_o2cb ocf:pacemaker:o2cb \
>>> op start interval="0" timeout="90" \
>>> op stop interval="0" timeout="100" \
>>> op monitor interval="10" \
>>> meta target-role="Started"
>>> group g_shared p_dlm_controld p_o2cb p_fs_shared
>>> ms ms_drbd_shared p_drbd_shared \
>>> meta master-max="2" clone-max="2" notify="true"
>>> ms ms_drbd_vm_1 p_drbd_vm_1 \
>>> meta master-max="2" clone-max="2" notify="true"
>>> clone cl_libvirtd p_libvirtd \
>>> meta globally-unique="false" interleave="true"
>>> clone cl_shared g_shared \
>>> meta interleave="true"
>>> location l_ipmi_node1 p_ipmi_node1 -inf: node1
>>> location l_ipmi_node2 p_ipmi_node2 -inf: node2
>>> order o_drbd_before_shared inf: ms_drbd_shared:promote cl_shared:start
>>> 
>>> Packages' versions:
>>> clvm                   2.02.95-7
>>> corosync               1.4.2-3
>>> dlm-pcmk               3.0.12-3.2+deb7u2
>>> drbd8-utils            2:8.3.13-2
>>> libdlm3                3.0.12-3.2+deb7u2
>>> libdlmcontrol3         3.0.12-3.2+deb7u2
>>> ocfs2-tools

Re: [Pacemaker] Two node KVM cluster

2013-04-09 Thread Oriol Mula-Valls
Thanks Andrew, I've managed to set up the system and currently have it 
working, but it is still under testing.


I have configured external/ipmi as the fencing device and then I force a 
reboot with echo b > /proc/sysrq-trigger. The fencing is working 
properly: the node is shut off and the VM migrated. However, as soon 
as I turn on the fenced now and the OS has started, the surviving node is shut 
down. Is it normal or am I doing something wrong?
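
A fence test can also be driven through the cluster itself instead of crashing a node; a 
rough sketch with the usual Pacemaker tools (option names may differ slightly on 1.1.7):

stonith_admin -L          # stonith devices currently registered with the cluster
stonith_admin -B node2    # ask the cluster to reboot-fence node2 via external/ipmi
crm_mon -1                # afterwards, check the status output for stonith failures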


On the other hand I've seen that in case I completely lose power fencing 
obviously fails. Would SBD stonith solve this issue?
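
If sbd is adopted, the resource side would look roughly like the following (crm shell); the 
device path is a placeholder and the timeout has to match the msgwait written on the sbd partition:

primitive p_sbd stonith:external/sbd \
        params sbd_device="/dev/disk/by-id/SHARED-SBD-DISK-part1"
property stonith-timeout="40s"   # must exceed the sbd msgwait value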


Kind regards,
Oriol

On 08/04/13 04:11, Andrew Beekhof wrote:


On 03/04/2013, at 9:15 PM, Oriol Mula-Valls  wrote:


Hi,

I started with Linux HA about one year ago. Currently I'm facing a new 
project in which I have to set up two nodes with highly available virtual 
machines. I have used Digimer's tutorial 
(https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial) as a starting point.

To deploy this new infrastructure I have two Fujitsu Primergy Rx100S7. Both 
machines have 8GB of RAM and 2x500GB HDs. I started by creating a software RAID1 
with the internal drives and installing Debian 7.0 (Wheezy). Apart from the 
O.S. partition I have created 3 more partitions: one for the shared storage 
between both machines with OCFS2, and the other two will be used as PVs to 
create LVs to support the VMs (one for the VMs that will be primary on node1 and 
the other for VMs that are primary on node2). These 3 partitions are replicated 
using DRBD.

The shared storage folder contains:
* ISO images needed when provisioning VMs
* scripts used to call virt-install which handles the creation of our VMs.
* XML definition files which define the emulated hardware backing the VMs
* old copies of the XML definition files.
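
For context, a VM defined by one of those XML files would typically be managed by a resource 
along these lines (crm shell); names and paths are illustrative, not taken from this thread:

primitive p_vm_1 ocf:heartbeat:VirtualDomain \
        params config="/shared/xml/vm_1.xml" hypervisor="qemu:///system" \
               migration_transport="ssh" \
        meta allow-migrate="true" \
        op monitor interval="30s" timeout="60s" \
        op migrate_to interval="0" timeout="180s" \
        op migrate_from interval="0" timeout="180s"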

I have more or less done the configuration for the OCFS2 fs and I was about to 
start the configuration of cLVM for one of the VGs but I have some doubts. I 
have one dlm for the OCFS2 filesystem, should I create another for cLVM RA?


No, there should only ever be one dlm resource (cloned like you have it)



This is the current configuration:
node node1
node node2
primitive p_dlm_controld ocf:pacemaker:controld \
op start interval="0" timeout="90" \
op stop interval="0" timeout="100" \
op monitor interval="10"
primitive p_drbd_shared ocf:linbit:drbd \
params drbd_resource="shared" \
op monitor interval="10" role="Master" timeout="20" \
op monitor interval="20" role="Slave" timeout="20" \
op start interval="0" timeout="240s" \
op stop interval="0" timeout="120s"
primitive p_drbd_vm_1 ocf:linbit:drbd \
params drbd_resource="vm_1" \
op monitor interval="10" role="Master" timeout="20" \
op monitor interval="20" role="Slave" timeout="20" \
op start interval="0" timeout="240s" \
op stop interval="0" timeout="120s"
primitive p_fs_shared ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/shared" directory="/shared" 
fstype="ocfs2" \
meta target-role="Started" \
op monitor interval="10"
primitive p_ipmi_node1 stonith:external/ipmi \
params hostname="node1" userid="admin" passwd="xxx" ipaddr="10.0.0.2" 
interface="lanplus"
primitive p_ipmi_node2 stonith:external/ipmi \
params hostname="node2" userid="admin" passwd="xxx" ipaddr="10.0.0.3" 
interface="lanplus"
primitive p_libvirtd lsb:libvirt-bin \
op monitor interval="120s" \
op start interval="0" \
op stop interval="0"
primitive p_o2cb ocf:pacemaker:o2cb \
op start interval="0" timeout="90" \
op stop interval="0" timeout="100" \
op monitor interval="10" \
meta target-role="Started"
group g_shared p_dlm_controld p_o2cb p_fs_shared
ms ms_drbd_shared p_drbd_shared \
meta master-max="2" clone-max="2" notify="true"
ms ms_drbd_vm_1 p_drbd_vm_1 \
meta master-max="2" clone-max="2" notify="true"
clone cl_libvirtd p_libvirtd \
meta globally-unique="false" interleave="true"
clone cl_shared g_shared \
meta interleave="true"
location l_ipmi_node1 p_ipmi_node1 -inf: node1
location l_ipmi_node2 p_ipmi_node2 -inf: node2
order o_drbd_before_shared inf: ms_drbd_shared:promote cl_shared:start
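
Note that the posted configuration only has the order constraint; a colocation is normally 
paired with it so the OCFS2 group runs where DRBD is promoted. A sketch of what that might 
look like (not part of the posted configuration):

colocation c_shared_on_drbd inf: cl_shared ms_drbd_shared:Master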

Packages' versions:
clvm                   2.02.95-7
corosync               1.4.2-3
dlm-pcmk               3.0.12-3.2+deb7u2
drbd8-utils            2:8.3.13-2
libdlm3                3.0.12-3.2+deb7u2
libdlmcontrol3         3.0.12-3.2+deb7u2
ocfs2-tools            1.6.4-1+deb7u1
ocfs2-tools-pacemaker  1.6.4-1+deb7u1
openais                1.1.4-4.1
pacemaker              1.1.7-1

As this is my first serious setup, suggestions are more than welcome.

Thanks for your help.

Oriol
--
Oriol Mula Valls
Institut Català de Ciències del Clima (IC3)
Doctor Trueta 203 - 08005 Barcelona
Tel:+34 

Re: [Pacemaker] Two node KVM cluster

2013-04-07 Thread Andrew Beekhof

On 03/04/2013, at 9:15 PM, Oriol Mula-Valls  wrote:

> Hi,
> 
> I started with Linux HA about one year ago. Currently I'm facing a new 
> project in which I have to set up two nodes with highly available virtual 
> machines. I have used Digimer's tutorial 
> (https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial) as a starting point.
> 
> To deploy this new infrastructure I have two Fujitsu Primergy Rx100S7. Both 
> machines have 8GB of RAM and 2x500GB HDs. I started by creating a software RAID1 
> with the internal drives and installing Debian 7.0 (Wheezy). Apart from the 
> O.S. partition I have created 3 more partitions: one for the shared storage 
> between both machines with OCFS2, and the other two will be used as PVs to 
> create LVs to support the VMs (one for the VMs that will be primary on node1 
> and the other for VMs that are primary on node2). These 3 partitions are 
> replicated using DRBD.
> 
> The shared storage folder contains:
> * ISO images needed when provisioning VMs
> * scripts used to call virt-install which handles the creation of our VMs.
> * XML definition files which define the emulated hardware backing the VMs
> * old copies of the XML definition files.
> 
> I have more or less done the configuration for the OCFS2 fs and I was about 
> to start the configuration of cLVM for one of the VGs but I have some doubts. 
> I have one dlm for the OCFS2 filesystem, should I create another for cLVM RA?

No, there should only ever be one dlm resource (cloned like you have it)
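
To illustrate that point, a cLVM resource would simply join the same cloned group rather 
than getting its own dlm; a minimal sketch (crm shell), assuming the ocf:lvm2:clvmd agent 
(the agent name can differ by packaging, e.g. ocf:heartbeat:clvm):

primitive p_clvmd ocf:lvm2:clvmd \
        op start interval="0" timeout="90" \
        op stop interval="0" timeout="100" \
        op monitor interval="30"
# extend the existing cloned group instead of adding a second dlm resource
group g_shared p_dlm_controld p_o2cb p_clvmd p_fs_shared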

> 
> This is the current configuration:
> node node1
> node node2
> primitive p_dlm_controld ocf:pacemaker:controld \
>   op start interval="0" timeout="90" \
>   op stop interval="0" timeout="100" \
>   op monitor interval="10"
> primitive p_drbd_shared ocf:linbit:drbd \
>   params drbd_resource="shared" \
>   op monitor interval="10" role="Master" timeout="20" \
>   op monitor interval="20" role="Slave" timeout="20" \
>   op start interval="0" timeout="240s" \
>   op stop interval="0" timeout="120s"
> primitive p_drbd_vm_1 ocf:linbit:drbd \
>   params drbd_resource="vm_1" \
>   op monitor interval="10" role="Master" timeout="20" \
>   op monitor interval="20" role="Slave" timeout="20" \
>   op start interval="0" timeout="240s" \
>   op stop interval="0" timeout="120s"
> primitive p_fs_shared ocf:heartbeat:Filesystem \
>   params device="/dev/drbd/by-res/shared" directory="/shared" 
> fstype="ocfs2" \
>   meta target-role="Started" \
>   op monitor interval="10"
> primitive p_ipmi_node1 stonith:external/ipmi \
>   params hostname="node1" userid="admin" passwd="xxx" ipaddr="10.0.0.2" 
> interface="lanplus"
> primitive p_ipmi_node2 stonith:external/ipmi \
>   params hostname="node2" userid="admin" passwd="xxx" ipaddr="10.0.0.3" 
> interface="lanplus"
> primitive p_libvirtd lsb:libvirt-bin \
>   op monitor interval="120s" \
>   op start interval="0" \
>   op stop interval="0"
> primitive p_o2cb ocf:pacemaker:o2cb \
>   op start interval="0" timeout="90" \
>   op stop interval="0" timeout="100" \
>   op monitor interval="10" \
>   meta target-role="Started"
> group g_shared p_dlm_controld p_o2cb p_fs_shared
> ms ms_drbd_shared p_drbd_shared \
>   meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_vm_1 p_drbd_vm_1 \
>   meta master-max="2" clone-max="2" notify="true"
> clone cl_libvirtd p_libvirtd \
>   meta globally-unique="false" interleave="true"
> clone cl_shared g_shared \
>   meta interleave="true"
> location l_ipmi_node1 p_ipmi_node1 -inf: node1
> location l_ipmi_node2 p_ipmi_node2 -inf: node2
> order o_drbd_before_shared inf: ms_drbd_shared:promote cl_shared:start
> 
> Packages' versions:
> clvm                   2.02.95-7
> corosync               1.4.2-3
> dlm-pcmk               3.0.12-3.2+deb7u2
> drbd8-utils            2:8.3.13-2
> libdlm3                3.0.12-3.2+deb7u2
> libdlmcontrol3         3.0.12-3.2+deb7u2
> ocfs2-tools            1.6.4-1+deb7u1
> ocfs2-tools-pacemaker  1.6.4-1+deb7u1
> openais                1.1.4-4.1
> pacemaker              1.1.7-1
> 
> As this is my first serious setup, suggestions are more than welcome.
> 
> Thanks for your help.
> 
> Oriol
> -- 
> Oriol Mula Valls
> Institut Català de Ciències del Clima (IC3)
> Doctor Trueta 203 - 08005 Barcelona
> Tel:+34 93 567 99 77
> 