Re: [Pacemaker] Preventing one node from being a target for resource migration
Dmitry Koterov wrote:
>Hello.
>
>I have a 3-node cluster managed by corosync+pacemaker+crm. Node1 and Node2
>are a DRBD master-slave pair, and they also have a number of other services
>installed (postgresql, nginx, ...). Node3 is just a corosync node (for
>quorum); no DRBD/postgresql/... is installed on it, only corosync+pacemaker.

A quorum-only node can run just corosync (and no pacemaker). It won't show
up in crm_mon, but it will still count towards quorum (at least with
corosync 2).

>But when I add resources to the cluster, some of them are somehow moved
>to node3 and then fail. Note that I have a "colocation" directive to
>place these resources on the DRBD master only, and a "location" with -inf
>for node3, but this does not help - why? How can I make pacemaker not run
>anything on node3?
>
>All the resources are added in a single transaction: "cat config.txt | crm
>-w -f- configure", where config.txt contains the directives and a "commit"
>statement at the end.
>
>Below are the "crm status" (error messages) and "crm configure show"
>outputs.
>
>root@node3:~# crm status
>Current DC: node2 (1017525950) - partition with quorum
>3 Nodes configured
>6 Resources configured
>
>Online: [ node1 node2 node3 ]
>
>Master/Slave Set: ms_drbd [drbd]
>    Masters: [ node1 ]
>    Slaves: [ node2 ]
>Resource Group: server
>    fs         (ocf::heartbeat:Filesystem):  Started node1
>    postgresql (lsb:postgresql):             Started node3 FAILED
>    bind9      (lsb:bind9):                  Started node3 FAILED
>    nginx      (lsb:nginx):                  Started node3 (unmanaged) FAILED
>
>Failed actions:
>    drbd_monitor_0 (node=node3, call=744, rc=5, status=complete,
>        last-rc-change=Mon Jan 12 11:16:43 2015, queued=2ms, exec=0ms): not installed
>    postgresql_monitor_0 (node=node3, call=753, rc=1, status=complete,
>        last-rc-change=Mon Jan 12 11:16:43 2015, queued=8ms, exec=0ms): unknown error
>    bind9_monitor_0 (node=node3, call=757, rc=1, status=complete,
>        last-rc-change=Mon Jan 12 11:16:43 2015, queued=11ms, exec=0ms): unknown error
>    nginx_stop_0 (node=node3, call=767, rc=5, status=complete,
>        last-rc-change=Mon Jan 12 11:16:44 2015, queued=1ms, exec=0ms): not installed
>
>root@node3:~# crm configure show | cat
>node $id="1017525950" node2
>node $id="13071578" node3
>node $id="1760315215" node1
>primitive drbd ocf:linbit:drbd \
>    params drbd_resource="vlv" \
>    op start interval="0" timeout="240" \
>    op stop interval="0" timeout="120"
>primitive fs ocf:heartbeat:Filesystem \
>    params device="/dev/drbd0" directory="/var/lib/vlv.drbd/root" \
>        options="noatime,nodiratime" fstype="xfs" \
>    op start interval="0" timeout="300" \
>    op stop interval="0" timeout="300"
>primitive postgresql lsb:postgresql \
>    op monitor interval="10" timeout="60" \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="60"
>primitive bind9 lsb:bind9 \
>    op monitor interval="10" timeout="60" \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="60"
>primitive nginx lsb:nginx \
>    op monitor interval="10" timeout="60" \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="60"
>group server fs postgresql bind9 nginx
>ms ms_drbd drbd meta master-max="1" master-node-max="1" clone-max="2" \
>    clone-node-max="1" notify="true"
>location loc_server server rule $id="loc_server-rule" -inf: #uname eq node3
>colocation col_server inf: server ms_drbd:Master
>order ord_server inf: ms_drbd:promote server:start
>property $id="cib-bootstrap-options" \
>    stonith-enabled="false" \
>    last-lrm-refresh="1421079189" \
>    maintenance-mode="false"

It looks like you have a symmetric cluster. That makes pacemaker probe every
node to see whether it can run each resource, even with a -inf location
constraint in place (the failed monitor_0 actions above are exactly those
probes). You want something like this:
http://clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ch06s02s02.html
(or to run only corosync on that node).
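A minimal sketch of the opt-in approach from that link, in crm shell syntax.
With symmetric-cluster="false" nothing runs anywhere unless it is explicitly
whitelisted, so both the DRBD ms resource and the group need positive
location constraints; the constraint names and scores below are arbitrary:

  property symmetric-cluster="false"
  # whitelist node1 and node2 for DRBD and for the server group
  location loc_ms_drbd_node1 ms_drbd 100: node1
  location loc_ms_drbd_node2 ms_drbd 100: node2
  location loc_server_node1 server 100: node1
  location loc_server_node2 server 50: node2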
Re: [Pacemaker] Losing corosync communication clusterwide
A hanging corosync sounds like a libqb problem: trusty ships with libqb 0.16,
which likes to hang from time to time. Try building libqb 0.17.

Daniel Dehennin wrote:
>Hello,
>
>I just had an issue on my pacemaker setup: my dlm/clvm/gfs2 stack was
>blocked.
>
>The “dlm_tool ls” command told me “wait ringid”.
>
>The corosync-* commands hang (like corosync-quorumtool).
>
>The pacemaker “crm_mon” displays nothing wrong.
>
>I'm using Ubuntu Trusty Tahr:
>
>- corosync 2.3.3-1ubuntu1
>- pacemaker 1.1.10+git20130802-1ubuntu2.1
>
>My cluster was manually rebooted.
>
>Any idea how to debug such a situation?
>
>Regards.
>--
>Daniel Dehennin
>Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
>Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
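A rough build-from-source sketch (assuming deb-src entries are enabled; the
tag name and install prefix are assumptions, adjust to your setup):

  # pull build dependencies for the packaged libqb, then build 0.17 from git
  sudo apt-get build-dep libqb0
  git clone https://github.com/ClusterLabs/libqb.git
  cd libqb
  git checkout v0.17.0
  ./autogen.sh && ./configure --prefix=/usr && make
  sudo make install
  # restart corosync and pacemaker afterwards so they load the new library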
Re: [Pacemaker] Fencing dependency between a bare metal host and its VM guests
I think the suggestion was to put shooting the host into the fencing path of
the VM. This way, if you can't get the host to fence the VM (because the
host is already dead), you just check whether the host itself was fenced.

Daniel Dehennin wrote:
>Andrei Borzenkov writes:
>
>[...]
>
>>> Now I have one issue: when the bare metal host on which the VM is
>>> running dies, the VM is lost and cannot be fenced.
>>>
>>> Is there a way to make pacemaker ACK the fencing of the VM running on a
>>> host when the host is fenced itself?
>>
>> Yes, you can define multiple stonith agents and a priority between them.
>>
>> http://clusterlabs.org/wiki/Fencing_topology
>
>Hello,
>
>If I understand correctly, fencing topology is the way to have several
>fencing devices for a node and try them consecutively until one works.
>
>In my configuration, I group the VM stonith agents with the
>corresponding VM resource, to make them move together[1].
>
>Here is my use case:
>
>1. Resource ONE-Frontend-Group runs on nebula1
>2. nebula1 is fenced
>3. node one-frontend cannot be fenced
>
>Is there a way to say that the life of node one-frontend is tied to the
>state of resource ONE-Frontend?
>
>In that case, when node nebula1 is fenced, pacemaker should be aware that
>resource ONE-Frontend is not running any more, so node one-frontend is
>OFFLINE and not UNCLEAN.
>
>Regards.
>
>Footnotes:
>[1] http://oss.clusterlabs.org/pipermail/pacemaker/2014-October/022671.html
>
>--
>Daniel Dehennin
>Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
>Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
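Roughly, in crm syntax (stonith-one-frontend and stonith-nebula1-ipmi are
made-up names for the VM's fencing device and the host's own fencing device):

  # level 1: fence the VM through its own stonith device;
  # level 2: fall back to shooting the bare metal host it runs on
  fencing_topology \
      one-frontend: stonith-one-frontend stonith-nebula1-ipmi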
Re: [Pacemaker] Resource Agents in OpenVZ containers
Andrew Beekhof wrote:
>On 16 Feb 2014, at 6:53 am, emmanuel segura wrote:
>
>> I think if you use pacemaker_remote inside the container, the container
>> will be a normal node of your cluster, so you can run pgsql + vip in it
>>
>> 2014-02-15 19:40 GMT+01:00 Tomasz Kontusz:
>>
>> Hi
>> I'm setting up a cluster which will use OpenVZ containers for
>> separating resources' environments.
>> So far I see it like this:
>> * each node runs Pacemaker
>> * each container runs pacemaker_remote, and one kind of resource (but
>>   there might be multiple containers providing the same resource)
>> * containers are started with the VirtualDomain agent (I had to patch it
>>   a bit to work around a libvirt/OpenVZ issue); each container resource
>>   is node-specific (and constrained to only run on the right node)
>>
>> The problem I have is with running a pgsql database with a virtual IP in
>> such a setup.
>> I want to have the IPaddr2 resource started on the node that holds the
>> container with the current pgsql master.
>> How can I go about achieving something like that?
>
>Colocate the IP with the OpenVZ VM?

Won't work: the containers are normally all running (psql01 on node01,
psql02 on node02), and I want to colocate with the current master.

>> Is the idea of using pacemaker_remote in such a setup sensible?
>>
>> --
>> Tomasz Kontusz
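For reference, colocating the IP with the master role of an ms resource
looks like this in crm syntax (ms_pgsql and vip are assumed names; it only
helps once the cluster can see where the master actually is, which is the
hard part here):

  colocation vip_with_pgsql_master inf: vip ms_pgsql:Master
  order vip_after_promote inf: ms_pgsql:promote vip:start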
Re: [Pacemaker] Resource Agents in OpenVZ containers
emmanuel segura wrote:
>I think if you use pacemaker_remote inside the container, the container
>will be a normal node of your cluster, so you can run pgsql + vip in it

Right. I didn't want to do it like this, as the containers are not
accessible from outside the cluster (they are in a separate subnet), and I
wanted to avoid the nodes acting as routers.

A related question is how to set up anti-colocation by HW node (so that the
database master and the apps end up on different nodes if possible)? See
the sketch below.

>2014-02-15 19:40 GMT+01:00 Tomasz Kontusz:
>
>> [...]
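On the anti-colocation question: one way to sketch it in crm syntax is a
negative colocation against the master role (apps and ms_pgsql are assumed
names; a finite negative score makes it a preference rather than a hard ban):

  # prefer to run the apps away from whichever node hosts the pgsql master
  colocation apps_away_from_master -1000: apps ms_pgsql:Master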
[Pacemaker] Resource Agents in OpenVZ containers
Hi
I'm setting up a cluster which will use OpenVZ containers for separating
resources' environments. So far I see it like this:

* each node runs Pacemaker
* each container runs pacemaker_remote, and one kind of resource (but there
  might be multiple containers providing the same resource)
* containers are started with the VirtualDomain agent (I had to patch it a
  bit to work around a libvirt/OpenVZ issue); each container resource is
  node-specific (and constrained to only run on the right node)

The problem I have is with running a pgsql database with a virtual IP in
such a setup. I want to have the IPaddr2 resource started on the node that
holds the container with the current pgsql master. How can I go about
achieving something like that?

Is the idea of using pacemaker_remote in such a setup sensible?

--
Tomasz Kontusz
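For illustration, a node-pinned container resource might look like this in
crm syntax (the resource name, config path, and libvirt URI are all
assumptions):

  primitive psql01-ct ocf:heartbeat:VirtualDomain \
      params config="/etc/libvirt/openvz/psql01.xml" hypervisor="openvz:///system" \
      op start interval="0" timeout="120" \
      op stop interval="0" timeout="120" \
      op monitor interval="30" timeout="60"
  # pin the container to node01 only
  location psql01-only-node01 psql01-ct rule -inf: #uname ne node01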