[Pacemaker] SBD fencing with stonith disabled

2013-11-19 Thread Angel L. Mateo

Hello,

I have a two-node cluster based on cman + pacemaker running on Ubuntu 12.04.

	Last weekend my active node was shut down, even though I had stonith disabled. In my config I have:


...
primitive stonith_sbd stonith:external/sbd \
	params sbd_device=/dev/disk/by-id/wwn-0x60002ac000356f6d-part1 \
	meta target-role=Started
...
property $id=cib-bootstrap-options \
dc-version=1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c \
cluster-infrastructure=cman \
expected-quorum-votes=2 \
no-quorum-policy=ignore \
stonith-enabled=false \
last-lrm-refresh=1384411940 \
maintenance-mode=false
rsc_defaults $id=rsc-options \
resource-stickiness=100

But my node was halted after the message:

Nov 16 12:20:47 myotis51 sbd: [1377]: WARN: Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)

Do I have to stop the sbd daemon even if I have stonith-enabled=false?

--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868887590
Fax: 86337

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] SBD fencing with stonith disabled

2013-11-19 Thread Angel L. Mateo

On 19/11/13 09:44, Lars Marowsky-Bree wrote:

On 2013-11-19T09:27:10, Angel L. Mateo ama...@um.es wrote:


property $id=cib-bootstrap-options \
 dc-version=1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c \


Wow, that's quite old.


It's the pacemaker version provided by Ubuntu 12.04.


Nov 16 12:20:47 myotis51 sbd: [1377]: WARN: Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)

Do I have to stop the sbd daemon even if I have stonith-enabled=false?


stonith-enabled=false does not disable sbd's self-fencing in case of
lost devices. (I think I'd be willing to take a patch if it isn't too
convoluted.)

	My cluster's nodes are VMware vSphere virtual machines. Could I use another stonith device, like external/vcenter? Is there any recommendation about it?



You may want to consider using the -P option to enable pacemaker
integration though; that could also make things better.


Is this an sbd option? I can't see that option (or it is undocumented):

amateo_adm@myotis51:/var/log$ sbd --help
sbd: invalid option -- '-'
Shared storage fencing tool.
Syntax:
sbd <options> <command> <cmdarguments>
Options:
-d <devname>  Block device to use (mandatory; can be specified up to 3 times)
-h            Display this help.
-n <node>     Set local node name; defaults to uname -n (optional)

-R            Do NOT enable realtime priority (debugging only)
-W            Use watchdog (recommended) (watch only)
-w <dev>      Specify watchdog device (optional) (watch only)
-T            Do NOT initialize the watchdog timeout (watch only)
-v            Enable some verbose debug logging (optional)

-1 <N>        Set watchdog timeout to N seconds (optional, create only)
-2 <N>        Set slot allocation timeout to N seconds (optional, create only)
-3 <N>        Set daemon loop timeout to N seconds (optional, create only)
-4 <N>        Set msgwait timeout to N seconds (optional, create only)
-5 <N>        Warn if loop latency exceeds threshold (optional, watch only)
              (default is 3, set to 0 to disable)
-t <N>        Interval in seconds for automatic child restarts (optional)
              (default is 3600, set to 0 to disable)
Commands:
create        initialize N slots on <dev> - OVERWRITES DEVICE!
list          List all allocated slots on device, and messages.
dump          Dump meta-data header from device.
watch         Loop forever, monitoring own slot
allocate <node>
              Allocate a slot for node (optional)
message <node> (test|reset|off|clear|exit)
              Writes the specified message to node's slot.
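
The help output above does not list -P; in newer sbd versions, pacemaker integration is typically enabled by adding -P to the daemon's start options, for example in the sbd defaults file. A sketch only; the exact file path and available options depend on your sbd version and distribution:

# /etc/sysconfig/sbd -- path as used elsewhere in this thread; adjust for your distribution
SBD_DEVICE="/dev/disk/by-id/wwn-0x60002ac000356f6d-part1"
SBD_OPTS="-W -P"   # -W: use the watchdog, -P: pacemaker integration (newer sbd only)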


Note that not running the sbd daemon and setting stonith-enabled=true
again will yield false STONITH successes. You're really not encouraged
to do that, or only very carefully.

	Yes, I know. I had stonith disabled (but sbd was running) because I'm having latency problems with my fibre channel disks, so I wanted to debug them without unnecessary reboots.


--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868887590
Fax: 86337



Re: [Pacemaker] SBD fencing with stonith disabled

2013-11-19 Thread Angel L. Mateo

On 19/11/13 11:34, Lars Marowsky-Bree wrote:

On 2013-11-19T11:25:36, Angel L. Mateo ama...@um.es wrote:


property $id=cib-bootstrap-options \
 dc-version=1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c \

Wow, that's quite old.

It's pacemaker provided by ubuntu 12.04.


Yeah, well. Still old. Probably something to complain about to the distribution maintainers.


My cluster's nodes are VMware vSphere virtual machines. Could I use another stonith device, like external/vcenter? Is there any recommendation about it?


Yes, you should also be able to use that.

	But is it recommended for a two-node cluster? I remember reading somewhere that in such a scenario sbd stonith is better because of the mechanism it provides (but I could be wrong).
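
For reference, an external/vcenter stonith resource is usually defined along these lines. A sketch only: the server name, credential store path and host-to-VM mapping are made-up examples, and the parameter names should be checked against crm ra info stonith:external/vcenter on your system:

primitive stonith_vcenter stonith:external/vcenter \
	params VI_SERVER="vcenter.example.com" \
	VI_CREDSTORE="/etc/vicredentials.xml" \
	HOSTLIST="myotis51=myotis51_vm;myotis52=myotis52_vm" \
	RESETPOWERON="0" \
	op monitor interval="3600"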


--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868887590
Fax: 86337



Re: [Pacemaker] Problems with SBD fencing

2013-08-20 Thread Angel L. Mateo

On 06/08/13 13:49, Jan Christian Kaldestad wrote:

In my case this does not work - read my original post. So I wonder if
there is a pacemaker bug (version 1.1.9-2db99f1). Killing pengine and
stonithd on the node which is supposed to shoot seems to resolve the
problem, though this is not a solution of course.

I also tested two separate stonith resources, one on each node. This stonith'ing works fine with this configuration. Is there something wrong with doing it this way?


Are you sure you have property stonith-enabled=true?

--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868889150
Fax: 86337



Re: [Pacemaker] Problems with SBD fencing

2013-08-20 Thread Angel L. Mateo

On 06/08/13 13:49, Jan Christian Kaldestad wrote:

In my case this does not work - read my original post. So I wonder if
there is a pacemaker bug (version 1.1.9-2db99f1). Killing pengine and
stonithd on the node which is supposed to shoot seems to resolve the
problem, though this is not a solution of course.

I also tested two separate stonith resources, one on each node. This stonith'ing works fine with this configuration. Is there something wrong with doing it this way?


	For it to work (Ubuntu 12.04) I had to create the /etc/sysconfig/sbd file with:

SBD_DEVICE=/dev/disk/by-id/wwn-0x6006016009702500a4227a04c6b0e211-part1
SBD_OPTS=-W

and the resource configuration is

primitive stonith_sbd stonith:external/sbd \
	params sbd_device=/dev/disk/by-id/wwn-0x6006016009702500a4227a04c6b0e211-part1 \
	meta target-role=Started

	Where /dev/disk/by-id/wwn-0x6006016009702500a4227a04c6b0e211-part1 is my disk device.
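
To actually fence with this resource, stonith has to be enabled again and the cluster's stonith-timeout should comfortably exceed the sbd device's msgwait. A sketch; the 40s value is only an illustration, derive it from your own sbd timeouts:

crm configure property stonith-enabled=true
crm configure property stonith-timeout=40s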


--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868889150
Fax: 86337



Re: [Pacemaker] Disabling failover for a resource

2013-05-30 Thread Angel L. Mateo

On 27/05/13 09:28, Michael Schwartzkopff wrote:

On Monday, 27 May 2013, 09:19:41, Angel L. Mateo wrote:

  Hello,

  I have configured an active/passive cluster for my dovecot server. Now I want to add to it a resource for running the backup service. I want this resource to run on the same node as the dovecot resource, but I don't want it to produce any failover. I mean, if the dovecot resource is moved to another node, then it should be moved too; but if the backup resource fails, then nothing has to be done with other resources.

  Is it enough for this just to disable monitoring on the resource?

No. Do it properly.

1) Make a colocation for the backup resource to dovecot:

col col_backup_dovecot inf: res_Backup res_Dovecot

2) Prevent the backup resource running on node2:

loc loc_Backup resBackup -inf: node2

	I tried this. This way I can't even move res_dovecot to node2, I guess because res_backup can't run on node2.


	What I want is not to forbid res_backup from running on node2, but just to avoid it producing a failover. But if the failover is because of res_dovecot, then res_backup should run on node2.
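
One way that is sometimes suggested to get "follow dovecot, but never drag it anywhere" is a colocation with a finite score instead of inf, so that the backup resource's own failures and location scores cannot outweigh dovecot's placement. A sketch in crm syntax; the score is illustrative and the resource names are taken from the suggestion above:

colocation col_backup_dovecot 1000: res_Backup res_Dovecot

With a finite score the backup resource still follows res_Dovecot wherever it runs, but a failure of res_Backup alone should not be able to push res_Dovecot to another node.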



--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868887590
Fax: 86337



Re: [Pacemaker] Disabling failover for a resource

2013-05-27 Thread Angel L. Mateo

On 27/05/13 09:28, Michael Schwartzkopff wrote:

On Monday, 27 May 2013, 09:19:41, Angel L. Mateo wrote:

  Hello,

  I have configured an active/passive cluster for my dovecot server. Now I want to add to it a resource for running the backup service. I want this resource to run on the same node as the dovecot resource, but I don't want it to produce any failover. I mean, if the dovecot resource is moved to another node, then it should be moved too; but if the backup resource fails, then nothing has to be done with other resources.

  Is it enough for this just to disable monitoring on the resource?

No. Do it properly.

1) Make a colocation for the backup resource to dovecot:

col col_backup_dovecot inf: res_Backup res_Dovecot

2) Prevent the backup resource running on node2:

loc loc_Backup resBackup -inf: node2

	But I want the backup resource to run on node2 when the dovecot resource is running on node2. Is that possible with this configuration?



--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868889150
Fax: 86337



Re: [Pacemaker] cman + corosync + pacemaker + fence_scsi

2013-04-26 Thread Angel L. Mateo

On 26/04/13 02:01, Andrew Beekhof wrote:


On 24/04/2013, at 10:48 PM, Angel L. Mateo ama...@um.es wrote:


Hello,

I'm trying to configure a 2-node cluster on Ubuntu with cman + corosync + pacemaker (the use of cman is because it is recommended in the pacemaker quickstart). In order to solve split brain in the 2-node cluster I'm using qdisk.


If you want to use qdisk, then you need something newer than 1.1.8 (which did 
not know how to filter qdisk from the membership).

	Oops. I have cman 3.1.7, corosync 1.4.2 and pacemaker 1.1.6 (the ones provided with Ubuntu 12.04).


	My purpose for using qdisk is to solve the split brain problem in my two-node cluster. Any other suggestion for this?





For fencing, I'm trying to use fence_scsi and at this point I'm having a problem. I have attached my cluster.conf.

xml <node id="/dev/block/8:33" type="normal" uname="/dev/block/8:33"/>
node myotis51
node myotis52
primitive cluster_ip ocf:heartbeat:IPaddr2 \
params ip=155.54.211.167 \
op monitor interval=30s
property $id=cib-bootstrap-options \
dc-version=1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c \
cluster-infrastructure=cman \
stonith-enabled=false \
last-lrm-refresh=1366803979

At this moment I'm trying just with an IP resource, but in the end I'll have LVM resources and the dovecot server running on top of them.

The problem I have is that whenever I interrupt network traffic between my nodes (to check whether quorum and fencing are working) the IP resource is started on both nodes of the cluster.


Do both sides claim to have quorum?
Also, had you enabled fencing, the cluster would have shot its peer before trying to start the IP.

	I think I did (this configuration has stonith disabled because it was modified for later tests), but I will check it again.




So it seems that the node fencing configured in cluster.conf is not working for me.


Because pacemaker cannot use it from there.
You need to follow


http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_configuring_cman_fencing.html

and then teach pacemaker about fence_scsi:


http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/ch09.html
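
The first link essentially boils down to pointing cman's fencing at pacemaker through the fence_pcmk pass-through agent in cluster.conf, roughly like this (a sketch following the Clusters from Scratch cman appendix; node names taken from this thread, check the linked document for the exact syntax for your versions):

<clusternodes>
  <clusternode name="myotis51" nodeid="1">
    <fence>
      <method name="pcmk-redirect">
        <device name="pcmk" port="myotis51"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="myotis52" nodeid="2">
    <fence>
      <method name="pcmk-redirect">
        <device name="pcmk" port="myotis52"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <fencedevice name="pcmk" agent="fence_pcmk"/>
</fencedevices>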


Then I have tried to configure it as a stonith resource (since it is listed by sudo crm ra list stonith), so I have tried to include

primitive stonith_fence_scsi stonith:redhat/fence_scsi

The problem I'm having with this is that I don't know how to specify params for the resource (I have tried params devices=..., params -d ..., but they are not accepted) and with this (default) configuration I get:


See the above link to chapter 9.

	I have tried this. The problem I'm having is that I don't know how to create the resource using fence_scsi. I have tried different syntaxes:


crm(live)configure# primitive stonith_fence_scsi stonith:redhat/fence_scsi \
 params name=scsi_fence devices=/dev/sdc
ERROR: stonith_fence_scsi: parameter name does not exist
ERROR: stonith_fence_scsi: parameter devices does not exist

crm(live)configure# primitive stonith_fence_scsi stonith:redhat/fence_scsi \
 params n=scsi_fence d=/dev/sdc
ERROR: stonith_fence_scsi: parameter d does not exist
ERROR: stonith_fence_scsi: parameter n does not exist

crm(live)configure# primitive stonith_fence_scsi stonith:redhat/fence_scsi \
 params -n=scsi_fence -d=/dev/sdc
ERROR: stonith_fence_scsi: parameter -d does not exist
ERROR: stonith_fence_scsi: parameter -n does not exist

	Does anyone have an example of this? What I would like is that, in case of problems, the node with access to the SCSI channel (the one using my LVM volumes) shoots the other one. Could I get the same behaviour with an external/sbd stonith resource?
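
A quick way to find out which parameter names a given stonith plugin actually accepts, before defining the primitive, is to query the agent metadata. A sketch, using the external/sbd plugin as the example:

crm ra info stonith:external/sbd
stonith -t external/sbd -n    # list the plugin's parameter names (cluster-glue stonith CLI)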


--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868889150
Fax: 86337



[Pacemaker] best setup for corosync + pacemaker in ubuntu 12.04

2013-04-25 Thread Angel L. Mateo

Hello everbody,

	As suggested by Andreas Mock in a previous thread... what is the best setup for corosync and pacemaker in a VM running Ubuntu 12.04?


	Pacemaker's quickstart (http://clusterlabs.org/quickstart-ubuntu.html) suggests adding cman to the configuration, but I'm not sure about this.


What do you think?

--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868889150
Fax: 86337



Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-26 Thread Angel L. Mateo

On 25/03/13 20:50, Jacek Konieczny wrote:

On Mon, 25 Mar 2013 20:01:28 +0100
Angel L. Mateo ama...@um.es wrote:

quorum {
provider: corosync_votequorum
expected_votes: 2
two_node: 1
}

Corosync will then manage quorum for the two-node cluster and
Pacemaker


   I'm using corosync 1.1 which is the one provided with my distribution (Ubuntu 12.04). I could also use cman.


I don't think corosync 1.1 can do that, but I guess in this case cman should be able to provide this functionality.


Sorry, it's corosync 1.4, not 1.1.
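
For the cman alternative mentioned above, the two-node equivalent is set in cluster.conf rather than corosync.conf; a sketch:

<cman two_node="1" expected_votes="1"/>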


can use that. You still need proper fencing to enforce the quorum (both for pacemaker and the storage layer – dlm in case you use clvmd), but no extra quorum node is needed.


   I have configured a dlm resource used with clvm.

   One doubt... With this configuration, how is the split brain problem handled?


The first node to notice that the other is unreachable will fence (kill) the other, making sure it is the only one operating on the shared data. Even though it is only half of the nodes, the cluster is considered quorate as the other node is known not to be running any cluster resources.

When the fenced node reboots its cluster stack starts, but with no quorum until it communicates with the surviving node again. So no cluster services are started there until both nodes communicate properly and quorum is recovered.

	But will this work with corosync 1.4? Although with corosync 1.4 I may not be able to use the quorum configuration you mentioned (I'll try), I have configured no-quorum-policy=ignore so the cluster can still run if one node fails. Could this be a problem?


--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868889150
Fax: 86337



[Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread Angel L. Mateo

Hello,

	I am a newbie with pacemaker (and, generally, with HA clusters). I have configured a two-node cluster. Both nodes are virtual machines (VMware ESX) and use shared storage (provided by a SAN, although access to the SAN is through the ESX infrastructure and the VMs see it as a SCSI disk). I have configured clvm so logical volumes are only active on one of the nodes.


	Now I need some help with the stonith configuration to avoid data corruption. Since I'm using ESX virtual machines, I think I won't have any problem using the external/vcenter stonith plugin to shut down virtual machines.


	My problem is how to avoid a split brain situation with this configuration, without configuring a 3rd node. I have read about quorum disks, the external/sbd stonith plugin and other references, but I'm too confused by all this.


	For example, [1] mentions techniques to improve quorum with SCSI reservations or a quorum daemon, but it doesn't explain how to do this with pacemaker. And [2] talks about external/sbd.


Any help?

PS: I have attached my corosync.conf and crm configure show outputs

[1] 
http://techthoughts.typepad.com/managing_computers/2007/10/split-brain-quo.html

[2] http://www.gossamer-threads.com/lists/linuxha/pacemaker/78887

--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868889150
Fax: 86337
# Please read the openais.conf.5 manual page

totem {
version: 2

# How long before declaring a token lost (ms)
token: 3000

# How many token retransmits before forming a new configuration
token_retransmits_before_loss_const: 10

# How long to wait for join messages in the membership protocol (ms)
join: 60

# How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
consensus: 3600

# Turn off the virtual synchrony filter
vsftype: none

# Number of messages that may be sent by one processor on receipt of the token
max_messages: 20

# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes

# Disable encryption
secauth: off

# How many threads to use for encryption/decryption
threads: 0

# Optionally assign a fixed node id (integer)
# nodeid: 1234

# This specifies the mode of redundant ring, which may be none, active, or passive.
rrp_mode: none

interface {
# The following values need to be set based on your environment
ringnumber: 0
bindnetaddr: 155.54.211.160
mcastaddr: 226.94.1.1
mcastport: 5405
}
}

amf {
mode: disabled
}

service {
# Load the Pacemaker Cluster Resource Manager
ver:   1
name:  pacemaker
}

aisexec {
user:   root
group:  root
}

logging {
fileline: off
to_stderr: yes
to_logfile: no
to_syslog: yes
syslog_facility: daemon
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
tags: enter|leave|trace1|trace2|trace3|trace4|trace6
}
}
node myotis51
node myotis52
primitive clvm ocf:lvm2:clvmd \
params daemon_timeout=30 \
meta target-role=Started
primitive dlm ocf:pacemaker:controld \
meta target-role=Started
primitive vg_users1 ocf:heartbeat:LVM \
params volgrpname=UsersDisk exclusive=yes \
op monitor interval=60 timeout=60
group dlm-clvm dlm clvm
clone dlm-clvm-clone dlm-clvm \
meta interleave=true ordered=true target-role=Started
location cli-prefer-vg_users1 vg_users1 \
rule $id=cli-prefer-rule-vg_users1 inf: #uname eq myotis52
property $id=cib-bootstrap-options \
dc-version=1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c \
cluster-infrastructure=openais \
expected-quorum-votes=2 \
stonith-enabled=false \
no-quorum-policy=ignore \
last-lrm-refresh=1364212376
rsc_defaults $id=rsc-options \
resource-stickiness=100



Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread Angel L. Mateo


Jacek Konieczny jaj...@jajcus.net wrote:

On Mon, 25 Mar 2013 13:54:22 +0100
  My problem is how to avoid split brain situation with this 
 configuration, without configuring a 3rd node. I have read about
 quorum disks, external/sbd stonith plugin and other references, but
 I'm too confused with all this.
 
  For example, [1] mention techniques to improve quorum with
 scsi reserve or quorum daemon, but it didn't point to how to do this
 pacemaker. Or [2] talks about external/sbd.
 
  Any help?


With corosync 2.2 (2.1 too, I guess) you can use, in corosync.conf:

quorum {
   provider: corosync_votequorum
   expected_votes: 2
   two_node: 1
}

Corosync will then manage quorum for the two-node cluster and Pacemaker

  I'm using corosync 1.1 which is the one provided with my distribution (Ubuntu 12.04). I could also use cman.

can use that. You still need proper fencing to enforce the quorum (both for pacemaker and the storage layer – dlm in case you use clvmd), but no extra quorum node is needed.

  I have configured a dlm resource used with clvm.

  One doubt... With this configuration, how is the split brain problem handled?

There is one more thing, though: you need two nodes active to boot the cluster, but then when one fails (and is fenced) the other may continue, keeping quorum.

Greets,
   Jacek

-- 
Sent from my Android phone with K-9 Mail.



Re: [Pacemaker] pacemaker + corosync + clvm in ubuntu

2013-03-23 Thread Angel L. Mateo
 I can't find this package in the Ubuntu repos. I'll try to use the Debian one

emmanuel segura emi2f...@gmail.com wrote:

Hello Angel

I'm using Debian, I don't know if the result on Ubuntu is the same, try

apt-file search dlm_controld.pcmk

Result should be:
dlm-pcmk: /usr/sbin/dlm_controld.pcmk


2013/3/22 Angel L. Mateo ama...@um.es

 Hello,

 I'm trying to configure a cluster based in pacemaker and
corosync
 in two ubuntu precise servers. The cluster is for an active/standby
 pop/imap server with a shared storage accesed through fibrechannel.

 In order to avoid concurrent access to this shared storage, I
need
 clvm (maybe I'm wrong), so I'm trying to configure it. According to
 different guides and howtos I have found I have configured a DLM and
clvm
 resource:

 root@myotis51:/etc/cluster# crm configure show
 node myotis51
 node myotis52
 primitive clvm ocf:lvm2:clvmd \
 params daemon_timeout=30 \
 meta target-role=Started
 primitive dlm ocf:pacemaker:controld \
 meta target-role=Started
 group dlm-clvm dlm clvm
 clone dlm-clvm-clone dlm-clvm \
 meta interleave=true ordered=true
 property $id=cib-bootstrap-options \

dc-version=1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c \
 cluster-infrastructure=cman \
 expected-quorum-votes=2 \
 stonith-enabled=false \
 no-quorum-policy=ignore \
 last-lrm-refresh=1363957949
 rsc_defaults $id=rsc-options \
 resource-stickiness=100

 With this configuration, resources are not launched, I think
 because DLM is failing because it's trying to launch
dlm_controld.pcmk
 which is not installed in my system

 Mar 22 13:52:57 myotis51 pengine: [2989]: notice: LogActions: Start
dlm:0
   (myotis51)
 Mar 22 13:52:57 myotis51 pengine: [2989]: notice: LogActions: Leave
dlm:1
   (Stopped)
 Mar 22 13:52:57 myotis51 crmd: [2990]: info: te_rsc_command:
Initiating
 action 4: monitor dlm:0_monitor_0 on myotis51 (local)
 Mar 22 13:52:57 myotis51 crmd: [2990]: info: do_lrm_rsc_op:
Performing
 key=4:0:7:71fa2334-a3f3-4c01-a000-7e702a32d0e2 op=dlm:0_monitor_0 )
 Mar 22 13:52:57 myotis51 lrmd: [2987]: info: rsc:dlm:0 probe[2] (pid
3050)
 Mar 22 13:52:57 myotis51 controld[3050]: ERROR: Setup problem:
couldn't
 find command: dlm_controld.pcmk
 Mar 22 13:52:57 myotis51 lrmd: [2987]: info: operation monitor[2] on
dlm:0
 for client 2990: pid 3050 exited with return code 5
 Mar 22 13:52:57 myotis51 crmd: [2990]: info: process_lrm_event: LRM
 operation dlm:0_monitor_0 (call=2, rc=5, cib-update=27,
confirmed=true) not
 installed
 Mar 22 13:52:57 myotis51 crmd: [2990]: WARN: status_from_rc: Action 4
 (dlm:0_monitor_0) on myotis51 failed (target: 7 vs. rc: 5): Error
 Mar 22 13:52:57 myotis51 crmd: [2990]: info: abort_transition_graph:
 match_graph_event:277 - Triggered transition abort (complete=0,
 tag=lrm_rsc_op, id=dlm:0_last_failure_0,
magic=0:5;4:0:7:71fa2334-a3f3-4c01-a000-7e702a32d0e2,
 cib=0.32.14) : Event failed
 Mar 22 13:52:57 myotis51 crmd: [2990]: info: match_graph_event:
Action
 dlm:0_monitor_0 (4) confirmed on myotis51 (rc=4)
 Mar 22 13:52:57 myotis51 pengine: [2989]: notice: unpack_rsc_op: Hard
 error - dlm:0_last_failure_0 failed with rc=5: Preventing
dlm-clvm-clone
 from re-starting on myotis51
 Mar 22 13:52:57 myotis51 pengine: [2989]: notice: LogActions: Leave
dlm:0
   (Stopped)
 Mar 22 13:52:57 myotis51 pengine: [2989]: notice: LogActions: Leave
dlm:1
   (Stopped)

 The problem with this is that I can't find any
dlm_controld.pcmk
 binary for ubuntu. Any idea on how to fix this?

 The closest command I have found is dlm_controld provided
with
 cman packages, but then I have to replace corosync with cman. Doing
this is
 not a big problem for me. The fact is that I'm newbie in HA and the
use of
 corosync instead of cman is because this is the one documented with
 pacemaker (http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/index.html).

 Is it corosync supposed to be better (or more open or more
 standard based) than cman? In case corosync is more recommended, then
what
 is the solution for the dlm problem?

 Thanks in advanced.

 --
 Angel L. Mateo Martínez
 Sección de Telemática
 Área de Tecnologías de la Información
 y las Comunicaciones Aplicadas (ATICA)
 http://www.um.es/atica
 Tfo: 868889150
 Fax: 86337





-- 
this is my life and I live it until God

[Pacemaker] pacemaker + corosync + clvm in ubuntu

2013-03-22 Thread Angel L. Mateo

Hello,

	I'm trying to configure a cluster based on pacemaker and corosync on two Ubuntu precise servers. The cluster is for an active/standby pop/imap server with shared storage accessed through fibre channel.


	In order to avoid concurrent access to this shared storage, I need clvm (maybe I'm wrong), so I'm trying to configure it. According to different guides and howtos I have found, I have configured DLM and clvm resources:


root@myotis51:/etc/cluster# crm configure show
node myotis51
node myotis52
primitive clvm ocf:lvm2:clvmd \
params daemon_timeout=30 \
meta target-role=Started
primitive dlm ocf:pacemaker:controld \
meta target-role=Started
group dlm-clvm dlm clvm
clone dlm-clvm-clone dlm-clvm \
meta interleave=true ordered=true
property $id=cib-bootstrap-options \
dc-version=1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c \
cluster-infrastructure=cman \
expected-quorum-votes=2 \
stonith-enabled=false \
no-quorum-policy=ignore \
last-lrm-refresh=1363957949
rsc_defaults $id=rsc-options \
resource-stickiness=100

	With this configuration, resources are not launched. I think this is because DLM is failing, since it's trying to launch dlm_controld.pcmk, which is not installed on my system:


Mar 22 13:52:57 myotis51 pengine: [2989]: notice: LogActions: Start 
dlm:0	(myotis51)
Mar 22 13:52:57 myotis51 pengine: [2989]: notice: LogActions: Leave 
dlm:1	(Stopped)
Mar 22 13:52:57 myotis51 crmd: [2990]: info: te_rsc_command: Initiating 
action 4: monitor dlm:0_monitor_0 on myotis51 (local)
Mar 22 13:52:57 myotis51 crmd: [2990]: info: do_lrm_rsc_op: Performing 
key=4:0:7:71fa2334-a3f3-4c01-a000-7e702a32d0e2 op=dlm:0_monitor_0 )

Mar 22 13:52:57 myotis51 lrmd: [2987]: info: rsc:dlm:0 probe[2] (pid 3050)
Mar 22 13:52:57 myotis51 controld[3050]: ERROR: Setup problem: couldn't 
find command: dlm_controld.pcmk
Mar 22 13:52:57 myotis51 lrmd: [2987]: info: operation monitor[2] on 
dlm:0 for client 2990: pid 3050 exited with return code 5
Mar 22 13:52:57 myotis51 crmd: [2990]: info: process_lrm_event: LRM 
operation dlm:0_monitor_0 (call=2, rc=5, cib-update=27, confirmed=true) 
not installed
Mar 22 13:52:57 myotis51 crmd: [2990]: WARN: status_from_rc: Action 4 
(dlm:0_monitor_0) on myotis51 failed (target: 7 vs. rc: 5): Error
Mar 22 13:52:57 myotis51 crmd: [2990]: info: abort_transition_graph: 
match_graph_event:277 - Triggered transition abort (complete=0, 
tag=lrm_rsc_op, id=dlm:0_last_failure_0, 
magic=0:5;4:0:7:71fa2334-a3f3-4c01-a000-7e702a32d0e2, cib=0.32.14) : 
Event failed
Mar 22 13:52:57 myotis51 crmd: [2990]: info: match_graph_event: Action 
dlm:0_monitor_0 (4) confirmed on myotis51 (rc=4)
Mar 22 13:52:57 myotis51 pengine: [2989]: notice: unpack_rsc_op: Hard 
error - dlm:0_last_failure_0 failed with rc=5: Preventing dlm-clvm-clone 
from re-starting on myotis51
Mar 22 13:52:57 myotis51 pengine: [2989]: notice: LogActions: Leave 
dlm:0	(Stopped)
Mar 22 13:52:57 myotis51 pengine: [2989]: notice: LogActions: Leave 
dlm:1	(Stopped)


	The problem with this is that I can't find any dlm_controld.pcmk binary for Ubuntu. Any idea on how to fix this?


	The closest command I have found is the dlm_controld provided with the cman packages, but then I have to replace corosync with cman. Doing this is not a big problem for me. The fact is that I'm a newbie in HA, and the use of corosync instead of cman is because it is the one documented with pacemaker (http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/index.html).


	Is corosync supposed to be better (or more open, or more standards-based) than cman? If corosync is more recommended, what is the solution to the dlm problem?


Thanks in advance.

--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868889150
Fax: 86337
