Re: [Pacemaker] Prevent one node from being a target for resource migration

2015-01-13 Thread Tomasz Kontusz


Dmitry Koterov wrote:
>Hello.
>
>I have a 3-node cluster managed by corosync+pacemaker+crm. Node1 and Node2
>are a DRBD master-slave pair, and they also have a number of other services
>installed (postgresql, nginx, ...). Node3 is just a corosync node (for
>quorum); no DRBD/postgresql/... is installed on it, only corosync+pacemaker.

A quorum node can run corosync alone (no pacemaker). It won't show up in 
crm_mon, but it still counts towards quorum (at least with corosync 2).
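For reference, a rough sketch of the quorum-related parts of a corosync 2
corosync.conf for that kind of setup (totem/logging sections omitted; the
node IDs are taken from your configuration below, and the node names are
assumed to resolve):

    quorum {
        provider: corosync_votequorum
    }

    nodelist {
        node {
            ring0_addr: node1
            nodeid: 1760315215
        }
        node {
            ring0_addr: node2
            nodeid: 1017525950
        }
        node {
            ring0_addr: node3
            nodeid: 13071578
        }
    }

With that in place, node3 only needs the corosync service; pacemaker can stay
stopped there and the node still provides its quorum vote.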

>But when I add resources to the cluster, some of them are somehow moved to
>node3 and fail there. Note that I have a "colocation" directive to place
>these resources on the DRBD master only and a "location" constraint with
>-inf for node3, but this does not help - why? How do I make pacemaker not
>run anything on node3?
>
>All the resources are added in a single transaction: "cat config.txt | crm
>-w -f- configure", where config.txt contains the directives and a "commit"
>statement at the end.
>
>Below are "crm status" (error messages) and "crm configure show"
>outputs.
>
>
>*root@node3:~# crm status*
>Current DC: node2 (1017525950) - partition with quorum
>3 Nodes configured
>6 Resources configured
>Online: [ node1 node2 node3 ]
>Master/Slave Set: ms_drbd [drbd]
> Masters: [ node1 ]
> Slaves: [ node2 ]
>Resource Group: server
> fs (ocf::heartbeat:Filesystem): Started node1
> postgresql (lsb:postgresql): Started node3 FAILED
> bind9 (lsb:bind9): Started node3 FAILED
> nginx (lsb:nginx): Started node3 (unmanaged) FAILED
>Failed actions:
>drbd_monitor_0 (node=node3, call=744, rc=5, status=complete,
>last-rc-change=Mon Jan 12 11:16:43 2015, queued=2ms, exec=0ms): not
>installed
>postgresql_monitor_0 (node=node3, call=753, rc=1, status=complete,
>last-rc-change=Mon Jan 12 11:16:43 2015, queued=8ms, exec=0ms): unknown
>error
>bind9_monitor_0 (node=node3, call=757, rc=1, status=complete,
>last-rc-change=Mon Jan 12 11:16:43 2015, queued=11ms, exec=0ms):
>unknown
>error
>nginx_stop_0 (node=node3, call=767, rc=5, status=complete,
>last-rc-change=Mon Jan 12 11:16:44 2015, queued=1ms, exec=0ms): not
>installed
>
>
>*root@node3:~# crm configure show | cat*
>node $id="1017525950" node2
>node $id="13071578" node3
>node $id="1760315215" node1
>primitive drbd ocf:linbit:drbd \
>params drbd_resource="vlv" \
>op start interval="0" timeout="240" \
>op stop interval="0" timeout="120"
>primitive fs ocf:heartbeat:Filesystem \
>params device="/dev/drbd0" directory="/var/lib/vlv.drbd/root"
>options="noatime,nodiratime" fstype="xfs" \
>op start interval="0" timeout="300" \
>op stop interval="0" timeout="300"
>primitive postgresql lsb:postgresql \
>op monitor interval="10" timeout="60" \
>op start interval="0" timeout="60" \
>op stop interval="0" timeout="60"
>primitive bind9 lsb:bind9 \
>op monitor interval="10" timeout="60" \
>op start interval="0" timeout="60" \
>op stop interval="0" timeout="60"
>primitive nginx lsb:nginx \
>op monitor interval="10" timeout="60" \
>op start interval="0" timeout="60" \
>op stop interval="0" timeout="60"
>group server fs postgresql bind9 nginx
>ms ms_drbd drbd meta master-max="1" master-node-max="1" clone-max="2"
>clone-node-max="1" notify="true"
>location loc_server server rule $id="loc_server-rule" -inf: #uname eq
>node3
>colocation col_server inf: server ms_drbd:Master
>order ord_server inf: ms_drbd:promote server:start
>property $id="cib-bootstrap-options" \
>stonith-enabled="false" \
>last-lrm-refresh="1421079189" \
>maintenance-mode="false"

It looks like you have a symmetric cluster. This makes pacemaker check every 
node for the possibility of running each resource (even with the -inf 
location rule for node3).
You want something like this: 
http://clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ch06s02s02.html
(or run only corosync on that node).
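A rough crm sketch of that opt-in approach (constraint IDs and scores are only
illustrative):

    property symmetric-cluster="false"
    location l_server_node1 server 100: node1
    location l_server_node2 server 50: node2
    location l_drbd_node1 ms_drbd 100: node1
    location l_drbd_node2 ms_drbd 100: node2

With symmetric-cluster=false a resource is only placed where a location
constraint explicitly allows it, so node3 is left alone and the -inf rule for
it becomes unnecessary.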

>
>

-- 
Sent with K-9 Mail.



Re: [Pacemaker] Losing corosync communication clusterwide

2014-11-10 Thread Tomasz Kontusz
A hanging corosync sounds like a libqb problem: Trusty ships with 0.16, which 
likes to hang from time to time. Try building libqb 0.17.
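Something along these lines should work on Trusty (an untested sketch; the
0.17.0 tag and the build-dependency list are assumptions, check the libqb
release page):

    sudo apt-get install build-essential autoconf automake libtool pkg-config
    wget https://github.com/ClusterLabs/libqb/archive/v0.17.0.tar.gz
    tar xzf v0.17.0.tar.gz
    cd libqb-0.17.0
    ./autogen.sh
    ./configure --prefix=/usr
    make
    sudo make install
    sudo ldconfig

Then restart corosync and pacemaker so they pick up the new library.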

Daniel Dehennin wrote:
>Hello,
>
>I just had an issue with my pacemaker setup: my dlm/clvm/gfs2 stack was
>blocked.
>
>The “dlm_tool ls” command told me “wait ringid”.
>
>The corosync-* commands hangs (like corosync-quorumtool).
>
>The pacemaker “crm_mon” displays nothing wrong.
>
>I'm using Ubuntu Trusty Tahr:
>
>- corosync 2.3.3-1ubuntu1
>- pacemaker 1.1.10+git20130802-1ubuntu2.1
>
>My cluster was manually rebooted.
>
>Any idea how to debug such a situation?
>
>Regards.
>-- 
>Daniel Dehennin
>Retrieve my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
>Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF
>
>
>
>

-- 
Sent with K-9 Mail.


Re: [Pacemaker] Fencing dependency between bare metal host and its VMs guest

2014-11-10 Thread Tomasz Kontusz
I think the suggestion was to put shooting the host into the fencing path of 
the VM. That way, if you can't get the host to fence the VM (because the host 
is already dead), you just check whether the host itself was fenced.
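In crm syntax that would look roughly like this (the stonith resource names
are made up; one-frontend and nebula1 are the node names from your setup):
level 1 is the VM's own stonith agent, level 2 is the agent that shoots the
whole host:

    fencing_topology \
        one-frontend: st-one-frontend-vm st-nebula1-power

If level 1 fails because nebula1 is already dead, pacemaker falls through to
level 2 and considers one-frontend fenced once the host is confirmed off.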

Daniel Dehennin wrote:
>Andrei Borzenkov  writes:
>
>
>[...]
>
>>> Now I have one issue: when the bare-metal host on which the VM is
>>> running dies, the VM is lost and cannot be fenced.
>>>
>>> Is there a way to make pacemaker ACK the fencing of the VM running on a
>>> host when the host itself is fenced?
>>>
>>
>> Yes, you can define multiple stonith agents and priorities between them.
>>
>> http://clusterlabs.org/wiki/Fencing_topology
>
>Hello,
>
>If I understand correctly, fencing topology is the way to have several
>fencing devices for a node and try them consecutively until one works.
>
>In my configuration, I group the VM stonith agents with the
>corresponding VM resource, to make them move together[1].
>
>Here is my use case:
>
>1. Resource ONE-Frontend-Group runs on nebula1
>2. nebula1 is fenced
>3. node one-frontend cannot be fenced
>
>Is there a way to say that the life of node one-frontend is tied to the
>state of resource ONE-Frontend?
>
>In that case, when node nebula1 is fenced, pacemaker would be aware that
>resource ONE-Frontend is not running any more, so node one-frontend would be
>OFFLINE and not UNCLEAN.
>
>Regards.
>
>Footnotes: 
>[1] 
>http://oss.clusterlabs.org/pipermail/pacemaker/2014-October/022671.html
>
>-- 
>Daniel Dehennin
>Retrieve my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
>Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF
>
>
>
>
>

-- 
Sent with K-9 Mail.


Re: [Pacemaker] Resource Agents in OpenVZ containers

2014-02-16 Thread Tomasz Kontusz


Andrew Beekhof wrote:
>
>On 16 Feb 2014, at 6:53 am, emmanuel segura  wrote:
>
>> I think if you use pacemaker_remote inside the container, the container
>> will be a normal node of your cluster, so you can run pgsql + vip in it.
>> 
>> 
>> 2014-02-15 19:40 GMT+01:00 Tomasz Kontusz :
>> Hi
>> I'm setting up a cluster which will use OpenVZ containers for separating
>> resources' environments.
>> So far I see it like this:
>>  * each node runs Pacemaker
>>  * each container runs pacemaker_remote and one kind of resource (but
>>    there might be multiple containers providing the same resource)
>>  * containers are started with the VirtualDomain agent (I had to patch it
>>    a bit to work around a libvirt/OpenVZ issue); each container resource
>>    is node-specific and constrained to only run on the right node
>> 
>> The problem I have is with running a pgsql database with a virtual IP in
>> such a setup.
>> I want to have the IPaddr2 resource started on the node that holds the
>> container with the current pgsql master.
>> How can I go about achieving something like that?
>
>Colocate the IP with the OpenVZ VM?

That won't work, because the containers are normally all running (psql01 on 
node 01, psql02 on node 02), and I want to colocate with the current master.

>> Is the idea of using pacemaker_remote in such setup sensible?
>> 
>> --
>> Tomasz Kontusz
>> 
>> 
>> 
>> 
>> -- 
>> this is my life and I live it for as long as God wills
>
>
>
>
>

-- 
Sent with K-9 Mail.



Re: [Pacemaker] Resource Agents in OpenVZ containers

2014-02-15 Thread Tomasz Kontusz


emmanuel segura wrote:
>I think if you use pacemaker_remote inside the container, the container
>will be a normal node of your cluster, so you can run pgsql + vip in it.
Right. I didn't want to do it like this, as the containers are not accessible 
from outside the cluster (they are in a separate subnet), and I wanted to 
avoid the nodes acting as routers.

A related question: how do I set up anti-colocation by hardware node (so that 
the database master and the apps end up on different nodes when possible)?
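Would something like a negative colocation score between the two container
resources be the right approach? A sketch of what I mean (app-container and
db-container are made-up names; a finite score would keep it a preference
rather than a hard ban):

    colocation apps_away_from_db -1000: app-container db-container

Or -inf: if they must never share a node.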


>2014-02-15 19:40 GMT+01:00 Tomasz Kontusz :
>
>> Hi
>> I'm setting up a cluster which will use OpenVZ containers for separating
>> resources' environments.
>> So far I see it like this:
>>  * each node runs Pacemaker
>>  * each container runs pacemaker_remote and one kind of resource (but
>>    there might be multiple containers providing the same resource)
>>  * containers are started with the VirtualDomain agent (I had to patch it
>>    a bit to work around a libvirt/OpenVZ issue); each container resource
>>    is node-specific and constrained to only run on the right node
>>
>> The problem I have is with running a pgsql database with a virtual IP in
>> such a setup.
>> I want to have the IPaddr2 resource started on the node that holds the
>> container with the current pgsql master.
>> How can I go about achieving something like that?
>> Is the idea of using pacemaker_remote in such a setup sensible?
>>
>> --
>> Tomasz Kontusz
>>
>>

-- 
Sent with K-9 Mail.



[Pacemaker] Resource Agents in OpenVZ containers

2014-02-15 Thread Tomasz Kontusz

Hi
I'm setting up a cluster which will use OpenVZ containers for separating 
resources' environments.

So far I see it like this:
 * each node runs Pacemaker
 * each container runs pacemaker_remote and one kind of resource (but 
   there might be multiple containers providing the same resource)
 * containers are started with the VirtualDomain agent (I had to patch it a 
   bit to work around a libvirt/OpenVZ issue); each container resource is 
   node-specific and constrained to only run on the right node (a sketch of 
   one such resource follows below)
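For illustration, a node-specific container resource of this kind could look
roughly like this in crm syntax (the config path, the openvz libvirt URI, and
the names are made up; the remote-node meta attribute is what makes the
container show up as a pacemaker_remote node):

    primitive ct-psql01 ocf:heartbeat:VirtualDomain \
        params config="/etc/libvirt/openvz/psql01.xml" \
            hypervisor="openvz:///system" \
        meta remote-node="psql01" \
        op monitor interval="30" timeout="60"
    location ct-psql01-on-node1 ct-psql01 inf: node1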


The problem I have is with running a pgsql database with a virtual IP in 
such a setup.
I want to have the IPaddr2 resource started on the node that holds the 
container with the current pgsql master.

How can I go about achieving something like that?
Is the idea of using pacemaker_remote in such a setup sensible?

--
Tomasz Kontusz
