Thanks Ken.

Regards,
Ashutosh
On Fri, Nov 10, 2017 at 6:57 AM, <users-requ...@clusterlabs.org> wrote:
> Send Users mailing list submissions to
>         users@clusterlabs.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://lists.clusterlabs.org/mailman/listinfo/users
> or, via email, send a message with subject or body 'help' to
>         users-requ...@clusterlabs.org
>
> You can reach the person managing the list at
>         users-ow...@clusterlabs.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Users digest..."
>
>
> Today's Topics:
>
>    1. Re: issues with pacemaker daemonization (Ken Gaillot)
>    2. Re: Pacemaker 1.1.18 Release Candidate 4 (Ken Gaillot)
>    3. Re: Issue in starting Pacemaker Virtual IP in RHEL 7 (Jan Pokorný)
>    4. Re: One cluster with two groups of nodes (Alberto Mijares)
>    5. Pacemaker responsible of DRBD and a systemd resource (Derek Wuelfrath)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 09 Nov 2017 09:49:20 -0600
> From: Ken Gaillot <kgail...@redhat.com>
> To: Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] issues with pacemaker daemonization
> Message-ID: <1510242560.5244.3.ca...@redhat.com>
> Content-Type: text/plain; charset="UTF-8"
>
> On Thu, 2017-11-09 at 15:59 +0530, ashutosh tiwari wrote:
> > Hi,
> >
> > We are observing that sometimes the pacemaker daemon gets the same
> > process group ID as the process/script calling "service pacemaker start",
> > while the child processes of pacemaker (cib/crmd/pengine) have their
> > process group ID equal to their own PID, which is how things should be
> > for a daemon AFAIK.
> >
> > Do we expect this to be handled by init.d (CentOS 6) or by the pacemaker
> > binary itself?
> >
> > pacemaker version: pacemaker-1.1.14-8.el6_8.1.x86_64
> >
> > Thanks and Regards,
> > Ashutosh Tiwari
>
> When pacemakerd spawns a child (cib etc.), it calls setsid() in the
> child to start a new session, which will set the process group ID and
> session ID to the child's PID.
>
> However, it doesn't do anything similar for itself. Possibly it should.
> It's a longstanding to-do item to make Pacemaker daemonize itself more
> "properly", but no one's had the time to address it.
> --
> Ken Gaillot <kgail...@redhat.com>
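For reference, one way to observe the behaviour described above on a running node is to compare the PID, PGID, and SID of pacemakerd and its children (a sketch only; the child daemon names vary by Pacemaker version):

    # With setsid() called in each child, a child's PGID and SID equal its PID;
    # pacemakerd itself may still share the PGID/SID of whatever started it.
    ps -eo pid,ppid,pgid,sid,comm | grep -E 'pacemakerd|cib|stonithd|lrmd|attrd|pengine|crmd'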
>
> ------------------------------
>
> Message: 2
> Date: Thu, 09 Nov 2017 10:11:08 -0600
> From: Ken Gaillot <kgail...@redhat.com>
> To: Kristoffer Grönlund <kgronl...@suse.com>, Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] Pacemaker 1.1.18 Release Candidate 4
> Message-ID: <1510243868.5244.5.ca...@redhat.com>
> Content-Type: text/plain; charset="UTF-8"
>
> On Fri, 2017-11-03 at 08:24 +0100, Kristoffer Grönlund wrote:
> > Ken Gaillot <kgail...@redhat.com> writes:
> >
> > > I decided to do another release candidate, because we had a large
> > > number of changes since rc3. The fourth release candidate for
> > > Pacemaker version 1.1.18 is now available at:
> > >
> > > https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.18-rc4
> > >
> > > The big changes are numerous scalability improvements and bundle fixes.
> > > We're starting to test Pacemaker with as many as 1,500 bundles (Docker
> > > containers) running on 20 guest nodes running on three 56-core
> > > physical cluster nodes.
> >
> > Hi Ken,
> >
> > That's really cool. What's the size of the CIB with that kind of
> > configuration? I guess it would compress pretty well, but still.
>
> The test cluster is gone now, so I'm not sure ... Beekhof might know.
>
> I know it's big enough that the transition graph could get too big to
> send via IPC, and we had to re-enable pengine's ability to write it to
> disk instead, and have the crmd read it from disk.
>
> > Cheers,
> > Kristoffer
> >
> > > For details on the changes in this release, see the ChangeLog.
> > >
> > > This is likely to be the last release candidate before the final
> > > release next week. Any testing you can do is very welcome.
> --
> Ken Gaillot <kgail...@redhat.com>
>
> ------------------------------
>
> Message: 3
> Date: Thu, 9 Nov 2017 20:18:26 +0100
> From: Jan Pokorný <jpoko...@redhat.com>
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Issue in starting Pacemaker Virtual IP in RHEL 7
> Message-ID: <20171109191826.gd10...@redhat.com>
> Content-Type: text/plain; charset="us-ascii"
>
> On 06/11/17 10:43 +0000, Somanath Jeeva wrote:
> > I am using a two-node pacemaker cluster with teaming enabled. The cluster has:
> >
> > 1. Two team interfaces with different subnets.
> > 2. team1 has an NFS VIP plumbed to it.
> > 3. The VirtualIP from pacemaker is configured to be plumbed to team0 (corosync ring number 0).
> >
> > In this case corosync takes the NFS IP as its ring address and
> > checks for it in corosync.conf. Since the conf file has the team0
> > hostname, corosync fails to start.
> >
> > Outputs:
> >
> > $ ip a
> >
> > [...]
> > 10: team1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
> >     link/ether 38:63:bb:3f:a4:ad brd ff:ff:ff:ff:ff:ff
> >     inet 10.64.23.117/28 brd 10.64.23.127 scope global team1
> >        valid_lft forever preferred_lft forever
> >     inet 10.64.23.121/24 scope global secondary team1:~m0
> >        valid_lft forever preferred_lft forever
> >     inet6 fe80::3a63:bbff:fe3f:a4ad/64 scope link
> >        valid_lft forever preferred_lft forever
> > 11: team0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
> >     link/ether 38:63:bb:3f:a4:ac brd ff:ff:ff:ff:ff:ff
> >     inet 10.64.23.103/28 brd 10.64.23.111 scope global team0
> >        valid_lft forever preferred_lft forever
> >     inet6 fe80::3a63:bbff:fe3f:a4ac/64 scope link
> >        valid_lft forever preferred_lft forever
> >
> > Corosync conf file:
> >
> > $ cat /etc/corosync/corosync.conf
> > totem {
> >     version: 2
> >     secauth: off
> >     cluster_name: DES
> >     transport: udp
> >     rrp_mode: passive
> >
> >     interface {
> >         ringnumber: 0
> >         bindnetaddr: 10.64.23.96
> >         mcastaddr: 224.1.1.1
> >         mcastport: 6860
> >     }
> > }
> >
> > nodelist {
> >     node {
> >         ring0_addr: dl380x4415
> >         nodeid: 1
> >     }
> >
> >     node {
> >         ring0_addr: dl360x4405
> >         nodeid: 2
> >     }
> > }
> >
> > quorum {
> >     provider: corosync_votequorum
> >     two_node: 1
> > }
> >
> > logging {
> >     to_logfile: yes
> >     logfile: /var/log/cluster/corosync.log
> >     to_syslog: yes
> > }
> >
> > /etc/hosts:
> >
> > $ cat /etc/hosts
> > [...]
> > 10.64.23.103  dl380x4415
> > 10.64.23.105  dl360x4405
> > [...]
> >
> > Logs:
> >
> > [3029]  dl380x4415 corosyncerror  [MAIN  ] Corosync Cluster Engine exiting with status 20 at service.c:356.
> > [19040] dl380x4415 corosyncnotice [MAIN  ] Corosync Cluster Engine ('2.4.0'): started and ready to provide service.
> > [19040] dl380x4415 corosyncinfo   [MAIN  ] Corosync built-in features: dbus systemd xmlconf qdevices qnetd snmp pie relro bindnow
> > [19040] dl380x4415 corosyncnotice [TOTEM ] Initializing transport (UDP/IP Multicast).
> > [19040] dl380x4415 corosyncnotice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
> > [19040] dl380x4415 corosyncnotice [TOTEM ] The network interface [10.64.23.121] is now up.
> > [19040] dl380x4415 corosyncnotice [SERV  ] Service engine loaded: corosync configuration map access [0]
> > [19040] dl380x4415 corosyncinfo   [QB    ] server name: cmap
> > [19040] dl380x4415 corosyncnotice [SERV  ] Service engine loaded: corosync configuration service [1]
> > [19040] dl380x4415 corosyncinfo   [QB    ] server name: cfg
> > [19040] dl380x4415 corosyncnotice [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
> > [19040] dl380x4415 corosyncinfo   [QB    ] server name: cpg
> > [19040] dl380x4415 corosyncnotice [SERV  ] Service engine loaded: corosync profile loading service [4]
> > [19040] dl380x4415 corosyncnotice [QUORUM] Using quorum provider corosync_votequorum
> > [19040] dl380x4415 corosynccrit   [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
> > [19040] dl380x4415 corosyncerror  [SERV  ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
>
> I suspect whether teaming is involved or not is irrelevant here.
>
> You are not using the latest and greatest 2.4.3, so I'd suggest either
> upgrading or applying this patch (present in that version), if that helps:
>
> https://github.com/corosync/corosync/commit/95f9583a25007398e3792bdca2da262db18f658a
>
> --
> Jan (Poki)
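A workaround sometimes used for this kind of address-resolution problem, independent of the patch Jan links to, is to put explicit IP addresses rather than hostnames in the nodelist, so the ring address no longer depends on how the local hostname resolves. A sketch using the team0 addresses from the /etc/hosts output above:

    nodelist {
        node {
            ring0_addr: 10.64.23.103   # team0 address of dl380x4415
            nodeid: 1
        }

        node {
            ring0_addr: 10.64.23.105   # team0 address of dl360x4405
            nodeid: 2
        }
    }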
>
> ------------------------------
>
> Message: 4
> Date: Thu, 9 Nov 2017 17:34:35 -0400
> From: Alberto Mijares <amijar...@gmail.com>
> To: Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] One cluster with two groups of nodes
> Message-ID: <CAGZBXN_Lv0pXUkVB_u_MWo_ZpHcFxVC3gYnS9xFYNxUZ46qTaA@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> > The first thing I'd mention is that a 6-node cluster can only survive
> > the loss of two nodes, as 3 nodes don't have quorum. You can tweak that
> > behavior with corosync quorum options, or you could add a quorum-only
> > node, or use corosync's new qdevice capability to have an arbiter node.
> >
> > Coincidentally, I recently stumbled across a long-time Pacemaker
> > feature that I wasn't aware of, that can handle this type of situation.
> > It's not documented yet but will be when 1.1.18 is released soon.
> >
> > Colocation constraints may take a "node-attribute" parameter, which
> > basically means, "Put this resource on a node of the same class as the
> > one running resource X".
> >
> > In this case, you might set a "group" node attribute on all nodes, set
> > to "1" on the three primary nodes and "2" on the three failover nodes.
> > Pick one resource as your base resource that everything else should go
> > along with, and configure colocation constraints for all the other
> > resources with that one, using "node-attribute=group". That means that
> > all the other resources must be on a node with the same "group"
> > attribute value as the node the base resource is running on.
> >
> > "node-attribute" defaults to "#uname" (node name), thus giving the
> > usual behavior of colocation constraints: put the resource only on a
> > node with the same name, i.e. the same node.
> >
> > The remaining question is, how do you want the base resource to fail
> > over? If the base resource can fail over to any other node, whether in
> > the same group or not, then you're done. If the base resource can only
> > run on one node in each group, ban it from the other nodes using
> > -INFINITY location constraints. If the base resource should only fail
> > over to the opposite group, that's trickier, but something roughly
> > similar would be to prefer one node in each group with an equal
> > positive-score location constraint, and migration-threshold=1.
> > --
> > Ken Gaillot <kgail...@redhat.com>
>
> Thank you very very much for this. I'm starting some tests in my lab tonight.
>
> I'll let you know my results, and I hope I can count on you if I get
> lost along the way.
>
> BTW, every resource is supposed to run only on its designated node
> within a group. For example: if nginx normally runs on A1, it MUST
> fail over to B1. The same goes for every resource.
>
> Best regards,
>
> Alberto Mijares
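A minimal sketch of the node-attribute approach Ken describes above, with placeholder node, resource, and attribute names (nodeA1, nodeB1, "web", "base" are only examples):

    # Tag each node with a "group" attribute (1 = primary nodes, 2 = failover nodes)
    crm_attribute --node nodeA1 --name group --update 1
    crm_attribute --node nodeB1 --name group --update 2

    # Colocate another resource with the chosen base resource by node attribute
    # (CIB XML form; requires a Pacemaker version that supports node-attribute):
    #   <rsc_colocation id="col-web-with-base" rsc="web" with-rsc="base"
    #                   score="INFINITY" node-attribute="group"/>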
>
> ------------------------------
>
> Message: 5
> Date: Thu, 9 Nov 2017 20:27:40 -0500
> From: Derek Wuelfrath <dwuelfr...@inverse.ca>
> To: users@clusterlabs.org
> Subject: [ClusterLabs] Pacemaker responsible of DRBD and a systemd resource
> Message-ID: <57ef4b1d-42a5-4b20-95c7-3a3c95f47...@inverse.ca>
> Content-Type: text/plain; charset="utf-8"
>
> Hello there,
>
> First post here, but I've been following for a while!
>
> Here's my issue: we have been putting in place and running this type of
> cluster for a while and never really encountered this kind of problem.
>
> I recently set up a Corosync / Pacemaker / PCS cluster to manage DRBD
> along with various other resources. Some of these resources are systemd
> resources ... this is the part where things are "breaking".
>
> A two-server cluster running only DRBD, or DRBD with an OCF IPaddr2
> resource (the cluster IP in this instance), works just fine. I can easily
> move from one node to the other without any issue.
> As soon as I add a systemd resource to the resource group, things break.
> Moving from one node to the other using standby mode still works just fine,
> but as soon as a Corosync / Pacemaker restart involves polling of a systemd
> resource, it seems to try to start the whole resource group and therefore
> creates a split-brain of the DRBD resource.
>
> That is the best explanation / description of the situation I can give.
> If it needs any clarification, examples, ... I am more than open to sharing them.
>
> Any guidance would be appreciated :)
>
> Here's the output of "pcs config":
>
> https://pastebin.com/1TUvZ4X9
>
> Cheers!
> -dw
>
> --
> Derek Wuelfrath
> dwuelfr...@inverse.ca :: +1.514.447.4918 (x110) :: +1.866.353.6153 (x110)
> Inverse inc. :: Leaders behind SOGo (www.sogo.nu), PacketFence (www.packetfence.org) and Fingerbank (www.fingerbank.org)
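For comparison, the usual constraint layout for DRBD plus a resource group looks roughly like the following. This is only a sketch with placeholder resource names ("drbd-data-master", "svc-group"), not taken from Derek's pcs config:

    # Promote DRBD before starting the group, and keep the group on the DRBD master
    pcs constraint order promote drbd-data-master then start svc-group
    pcs constraint colocation add svc-group with master drbd-data-master INFINITY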
>
> ------------------------------
>
> _______________________________________________
> Users mailing list
> Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
>
> End of Users Digest, Vol 34, Issue 18
> *************************************
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org