Re: [Pacemaker] [Linux-HA] Announcing the Heartbeat 3.0.6 Release
On 10.02.2015 22:24, Lars Ellenberg wrote:
> TL;DR: If you intend to set up a new High Availability cluster using
> the Pacemaker cluster manager, you typically should not care for
> Heartbeat, but use recent releases (2.3.x) of Corosync.
> [...]
> After 3½ years since the last "officially tagged" release of Heartbeat,
> I have seen the need to do a new "maintenance release".
>
> The Heartbeat 3.0.6 release tag: 3d59540cf28d
> and the change set it points to: cceeb47a7d8f

GREAT !!! Thank you very much, Lars!

Heartbeat is still running on some of our production clusters ...

> [...]
Re: [Pacemaker] Two node cluster and no hardware device for stonith.
On Tue, 10 Feb 2015 15:58:57 +0100 Dejan Muhamedagic wrote:
> On Mon, Feb 09, 2015 at 04:41:19PM +0100, Lars Ellenberg wrote:
> > On Fri, Feb 06, 2015 at 04:15:44PM +0100, Dejan Muhamedagic wrote:
> > > [...]
> > > A full cluster can consist of one node only. Hence, it is
> > > possible to have a kind of stretch two-node [multi-site] cluster
> > > based on tickets and managed by booth.
> >
> > In theory.
> >
> > In practice, we rely on "proper behaviour" of "the other site",
> > in case a ticket is revoked, or cannot be renewed.
> > [...]
> > There are deployments which favor
> > "rather online with _potential_ split brain" over
> > "rather offline just in case".
>
> There's an arbitrator which should help in case of split brain.

You can never really differentiate between a site being down and a site
being cut off due to a (network) infrastructure outage. The arbitrator can
mitigate split brain only to the extent that you trust your network. You
still have to decide what you value more - data availability or data
consistency.

Long-distance clusters are really for disaster recovery. It is convenient
to have a single button that starts up all resources in a controlled
manner, but someone really needs to decide to push that button.

> > Document this, print it out on paper,
> >
> >    "I am aware that this may lead to lost transactions,
> >    data divergence, data corruption, or data loss.
> >    I am personally willing to take the blame,
> >    and live with the consequences."
> >
> > Have some "boss" sign that ^^^
> > in the real world using a real pen.
>
> Well, of course running such a "stretch" cluster would be
> rather different from a "normal" one.
>
> The essential thing is that there's no fencing, unless configured
> as a dead-man switch for the ticket. Given that booth has a
> "sanity" program hook, maybe that could be utilized to verify if
> this side of the cluster is healthy enough.
>
> Thanks,
>
> Dejan
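For reference, a ticket/arbitrator layout of the kind discussed above is
wired together roughly as follows. This is only a minimal sketch: the
addresses, port and ticket name are invented, and the exact keywords depend
on the booth version in use.

    # /etc/booth/booth.conf -- one copy on each site and on the arbitrator
    transport  = UDP
    port       = 9929
    site       = 192.168.201.100    # site A (full cluster or a single node)
    site       = 192.168.202.100    # site B
    arbitrator = 192.168.203.100    # tie-breaking third location
    ticket = "ticket-A"
        expire  = 600               # be generous with timeouts, as noted above
        timeout = 10

    # On the Pacemaker side, bind the resources to the ticket; with
    # loss-policy=fence the ticket acts as the dead-man switch mentioned
    # in this thread ("db-group" is a made-up resource name):
    crm configure rsc_ticket ticket-A-db ticket-A: db-group loss-policy=fence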
Re: [Pacemaker] Active/Active
try to change your controld daemon:

  OCF_ROOT=/usr/lib/ocf /usr/lib/ocf/resource.d/pacemaker/controld meta-data

  The daemon to start - supports gfs_controld(.pcmk) and dlm_controld(.pcmk)
  The daemon to start

and remember you need to configure the cluster fencing, because DLM relies on it.

2015-02-10 23:08 GMT+01:00 José Luis Rodríguez Rodríguez:
> Hi Emmanuel, I installed this package but the result is the same when I try
> to mount /dev/drbd1 on /mnt:
>
> gfs_controld join connect error: Connection refused
> error mounting lockproto lock_dlm
>
> I have installed gfs2-tools, dlm-pcmk and the one you pointed me to, gfs-pcmk
>
> My pacemaker configuration is:
> [...]
>
> What is my error?
> [...]
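Along the lines of the meta-data output above, a second controld instance
for gfs_controld.pcmk next to the existing dlm clone would look roughly like
this in crm syntax. This is only a sketch: the resource and clone names
(gfs_ctrl, gfs_clone) are invented, and the args value follows the older
Clusters from Scratch examples.

    primitive gfs_ctrl ocf:pacemaker:controld \
            params daemon="gfs_controld.pcmk" args="-g 0" \
            op monitor interval="120s"
    clone gfs_clone gfs_ctrl \
            meta clone-max="2" clone-node-max="1" interleave="true"
    # gfs_controld needs the dlm clone underneath it
    colocation gfs-with-dlm inf: gfs_clone dlm_clone
    order gfs-after-dlm inf: dlm_clone gfs_clone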
Re: [Pacemaker] Version of libqb is too old: v0.13 or greater requried
On 01/29/15 09:28, Thomas Manninger wrote:
> Hi,

Hi David, Hi Thomas,

Thanks for your help.

> create with checkinstall a Debian package of libqb0, then it should work.

Thomas, I have created the Debian package with checkinstall, and completed
the install of Pacemaker.

Thanks again.

Alexis.

> Regards
>
> Sent: Wednesday, 28 January 2015, 19:18
> From: "Alexis de BRUYN"
> To: pacema...@clusterlabs.org
> Subject: [Pacemaker] Version of libqb is too old: v0.13 or greater requried
>
> Hi Everybody,
>
> I have compiled libqb 0.17.1 under Debian Jessie/testing amd64 as:
>
>   tar zxvf libqb-v0.17.1.tar.gz
>   cd libqb-0.17.1/
>   ./autogen.sh
>   ./configure
>   make -j8
>   make -j8 install
>
> Then after successful builds of COROSYNC 2.3.4, CLUSTER-GLUE 1.0.12 and
> RESOURCE-AGENTS 3.9.5, compiling PACEMAKER 1.1.12 fails with:
>
>   unzip Pacemaker-1.1.12.zip
>   cd pacemaker-Pacemaker-1.1.12/
>   addgroup --system haclient
>   ./autogen.sh
>   ./configure
>   [...]
>   configure: error: in `/home/alexis/pacemaker-Pacemaker-1.1.12':
>   configure: error: Version of libqb is too old: v0.13 or greater requried
>
> I have tried to pass some flags to ./configure, but I still get this error.
>
> What am I doing wrong?
>
> Thanks for your help,
>
> --
> Alexis de BRUYN
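For anyone hitting the same error, the checkinstall route that worked here
looks roughly like this. It is only a sketch: the package name and version
fields are just reasonable choices, and the PKG_CONFIG_PATH line is only
needed if pacemaker's configure still picks up an older libqb.pc.

    cd libqb-0.17.1/
    ./autogen.sh && ./configure && make -j8
    # build and register a .deb instead of a bare "make install"
    sudo checkinstall --pkgname=libqb0 --pkgversion=0.17.1 --default make install
    sudo ldconfig
    export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH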
Re: [Pacemaker] Active/Active
Hi Emmanuel, I installed this package but the result is the same when I try
to mount /dev/drbd1 on /mnt:

gfs_controld join connect error: Connection refused
error mounting lockproto lock_dlm

I have installed gfs2-tools, dlm-pcmk and the one you pointed me to, gfs-pcmk.

My pacemaker configuration is:

node nodo1
node nodo2
primitive FAILOVER-ADDR ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.100" nic="eth0" \
        op monitor interval="10s" meta-is-managed="true" \
        meta target-role="Started"
primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www" fstype="ext4" \
        meta target-role="Stopped"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/apache2/apache2.conf" statusurl="http://localhost/server-status" \
        op monitor interval="1min" \
        meta target-role="Started"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="60s"
ms WebDataClone WebData \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
clone dlm_clone dlm \
        meta clone-max="2" clone-node-max="1" target-role="Started"
location PREFERIDO-NODO1 WebSite 50: nodo1
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation website-with-ip inf: WebSite FAILOVER-ADDR
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: FAILOVER-ADDR WebSite
property $id="cib-bootstrap-options" \
        dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
op_defaults $id="op-options" \
        timeout="240s"

With sudo crm_mon:

Online: [ nodo1 nodo2 ]

FAILOVER-ADDR (ocf::heartbeat:IPaddr2): Started nodo1
Master/Slave Set: WebDataClone [WebData]
     Masters: [ nodo1 ]
     Slaves: [ nodo2 ]
Clone Set: dlm_clone [dlm]
     Started: [ nodo2 nodo1 ]

The DRBD status is:

Nodo 1
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-
   ns:264908 nr:0 dw:0 dr:267236 al:0 bm:19 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Nodo 2
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-
   ns:0 nr:264908 dw:264908 dr:0 al:0 bm:19 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

What is my error?

On 10 February 2015 at 16:02, emmanuel segura wrote:
> I'm using debian 7
>
>   apt-cache show gfs-pcmk
>   ...
>   This package contains the GFS module for pacemaker.
>   ...
> [...]

--
Saludos,

José Luis
--
Profesor Informática IES Jacarandá - Brenes (Sevilla)
http://www.iesjacaranda.es - www.iesjacaranda-brenes.org
twitter: @jlrod2
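Note that the configuration above has stonith-enabled="false"; as pointed
out elsewhere in this thread, DLM/GFS2 cannot recover safely without
fencing. Purely as an illustration (assuming the nodes are libvirt/KVM
guests and the external/libvirt plugin from cluster-glue is installed; the
hypervisor URI and the resource names are invented), fencing could be added
along these lines:

    primitive fence-nodo1 stonith:external/libvirt \
            params hostlist="nodo1" hypervisor_uri="qemu+ssh://kvm-host/system" \
            op monitor interval="60m"
    primitive fence-nodo2 stonith:external/libvirt \
            params hostlist="nodo2" hypervisor_uri="qemu+ssh://kvm-host/system" \
            op monitor interval="60m"
    # keep each fence device off the node it is meant to kill
    location l-fence-nodo1 fence-nodo1 -inf: nodo1
    location l-fence-nodo2 fence-nodo2 -inf: nodo2
    property stonith-enabled="true"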
[Pacemaker] Announcing the Heartbeat 3.0.6 Release
TL;DR: If you intend to set up a new High Availability cluster using the
Pacemaker cluster manager, you typically should not care for Heartbeat,
but use recent releases (2.3.x) of Corosync.

If you don't care for Heartbeat, don't read further.
Unless you are beekhof... there's a question below ;-)

After 3½ years since the last "officially tagged" release of Heartbeat,
I have seen the need to do a new "maintenance release".

The Heartbeat 3.0.6 release tag: 3d59540cf28d
and the change set it points to: cceeb47a7d8f

The main reason for this was that pacemaker more recent than somewhere
between 1.1.6 and 1.1.7 would no longer work properly on the Heartbeat
cluster stack, because some of the daemons have moved from "glue" to
"pacemaker" proper and changed their paths. This has been fixed in
Heartbeat.

And because during that time, stonith-ng was refactored, and would still
reliably fence, but not understand its own confirmation message, so it
was effectively broken. This I fixed in pacemaker.

If you choose to run new Pacemaker with the Heartbeat communication stack,
it should be at least 1.1.12 with a few patches; see my December 2014
commits at the top of
https://github.com/lge/pacemaker/commits/linbit-cluster-stack-pcmk-1.1.12

I'm not sure if they got into pacemaker upstream yet.
beekhof? Do I need to rebase? Or did I miss you merging these?

---

If you have those patches, consider setting this new ha.cf configuration
parameter:

# If pacemaker crmd spawns the pengine itself,
# it sometimes "forgets" to kill the pengine on shutdown,
# which later may confuse the system after cluster restart.
# Tell the system that Heartbeat is supposed to
# control the pengine directly.
crmd_spawns_pengine off

Here is the shortened Heartbeat changelog; the longer version is available
in mercurial: http://hg.linux-ha.org/heartbeat-STABLE_3_0/shortlog

- fix emergency shutdown due to broken update_ackseq
- fix node dead detection problems
- fix converging of membership (ccm)
- fix init script startup glitch (caused by changes in glue/resource-agents)
- heartbeat.service file for systemd platforms
- new ucast6 UDP IPv6 communication plugin
- package ha_api.py in standard package
- update some man pages, specifically the example ha.cf
- also report ccm membership status for cl_status hbstatus -v
- updated some log messages, or their log levels
- reduce max_delay in broadcast client_status query to one second
- apply various (mostly cosmetic) patches from Debian
- drop HBcompress compression plugins: they are part of cluster glue
- drop "openais" HBcomm plugin
- better support for current pacemaker versions
- try to not miss a SIGTERM (fix problem with very fast respawn/stop cycle)
- dopd: ignore dead ping nodes
- cl_status improvements
- api internals: reduce IPC round-trips to get at status information
- uid=root is sufficient to use heartbeat api (gid=haclient remains sufficient)
- fix /dev/null as log- or debugfile setting
- move daemon binaries into libexecdir
- document movement of compression plugins into cluster-glue
- fix usage of SO_REUSEPORT in ucast sockets
- fix compile issues with recent gcc and -Werror

Note that a number of the mentioned "fixes" have been created two years ago
already, and may have been released in packages for a long time, where
vendors have chosen to package them.

As to future plans for Heartbeat:

Heartbeat is still useful for non-pacemaker, "haresources"-mode clusters.
We (Linbit) will maintain Heartbeat for the foreseeable future.
That should not be too much of a burden, as it is "stable", and due to
long years of field exposure, "all bugs are known" ;-)

The most notable shortcoming when using Heartbeat with Pacemaker clusters
would be the limited message size. There are currently no plans to remove
that limitation.

With its wide choice of communication paths, even "exotic" communication
plugins, and the ability to run "arbitrarily many" paths, some deployments
may even favor it over Corosync still.

But typically, for new deployments involving Pacemaker, in most cases you
should choose Corosync 2.3.x as your membership and communication layer.

For existing deployments using Heartbeat, upgrading to this Heartbeat
version is strongly recommended.

Thanks,

Lars Ellenberg
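Pulling the pieces above together, a Pacemaker-on-Heartbeat node would
carry an ha.cf along these lines. This is only an illustrative sketch: the
node names, interfaces and the IPv6 peer address are invented, and the
ucast6 syntax is assumed to mirror that of ucast.

    # /etc/ha.d/ha.cf
    autojoin none
    node alice bob
    bcast eth0                          # any of the usual communication paths
    ucast6 eth1 fe80::210:18ff:fe00:2   # the new IPv6 unicast plugin in 3.0.6
    crm respawn
    crmd_spawns_pengine off             # only with the patched pacemaker >= 1.1.12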
Re: [Pacemaker] [Openais] Issues with a squid cluster.
This is really a question for the pacemaker list, so CCing.

Regards,
Honza

Redeye wrote:
> I am not certain where I should post this, hopefully someone will point me
> in the right direction.
>
> I have a two node cluster on Ubuntu 12.04, corosync, pacemaker, and squid.
> Squid is not starting at boot; pacemaker is controlling that. The two
> servers are communicating just fine, pacemaker starts, stops, and monitors
> the squid resources just fine too. My problem is that I am unable to do
> anything with the squid instances. For example, I want to update an acl,
> and I want to bounce the squid service to load the new settings.
> "service squid3 stop|start|status|restart|etc" does nothing, it returns
> unknown instance. "ps -af | grep squid" shows two instances, one user root,
> one user proxy, and squid is doing what it is supposed to.
>
> What can I do to remedy this?
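When an init script answers "unknown instance" for a cluster-managed
service, the restart usually has to go through the cluster manager instead.
A sketch only, assuming the resource is simply named "squid" (check the
real name with crm_mon):

    crm resource restart squid   # bounce squid via pacemaker, not "service"
                                 # (or: crm resource stop squid; crm resource start squid)
    squid3 -k reconfigure        # reload changed ACLs without restarting the resource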
Re: [Pacemaker] Active/Active
I'm using debian 7

  apt-cache show gfs-pcmk
  ...
  This package contains the GFS module for pacemaker.
  ...

2015-02-10 8:55 GMT+01:00 José Luis Rodríguez Rodríguez:
> Hello,
>
> I would like to create an active/active cluster by using pacemaker and
> corosync on Debian. I have followed the documentation
> http://clusterlabs.org/doc/Cluster_from_Scratch.pdf. It works well until
> 8.2.2 Create and Populate a GFS2 Partition. When I try to mount the disk
> /dev/drbd1 as /mnt, the output is:
>
> gfs_controld join connect error: Connection refused
> error mounting lockproto lock_dlm
>
> I have read that it is necessary to use cman, but then the resources
> created by pacemaker (with the command crm configure primitive ...) don't
> appear in the output of crm_mon.
>
> What could I do?
>
> --
> Saludos,
>
> José Luis

--
esta es mi vida e me la vivo hasta que dios quiera
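For anyone searching later, the package names that come up in this thread
on Debian 7 are these; a sketch of the install step only:

    apt-get install gfs2-tools dlm-pcmk gfs-pcmk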
Re: [Pacemaker] Two node cluster and no hardware device for stonith.
On Mon, Feb 09, 2015 at 04:41:19PM +0100, Lars Ellenberg wrote:
> On Fri, Feb 06, 2015 at 04:15:44PM +0100, Dejan Muhamedagic wrote:
> > Hi,
> >
> > On Thu, Feb 05, 2015 at 09:18:50AM +0100, Digimer wrote:
> > > That is the problem that makes geo-clustering very hard to nearly
> > > impossible. You can look at the Booth option for pacemaker, but that
> > > requires two (or more) full clusters, plus an arbitrator 3rd
> >
> > A full cluster can consist of one node only. Hence, it is
> > possible to have a kind of stretch two-node [multi-site] cluster
> > based on tickets and managed by booth.
>
> In theory.
>
> In practice, we rely on "proper behaviour" of "the other site",
> in case a ticket is revoked, or cannot be renewed.
>
> Relying on a single node for "proper behaviour" does not inspire
> as much confidence as relying on a multi-node HA-cluster at each site,
> which we can expect to ensure internal fencing.
>
> With reliable hardware watchdogs, it still should be ok to do
> "stretched two node HA clusters" in a reliable way.
>
> Be generous with timeouts.

As always.

> And document which failure modes you expect to handle,
> and how to deal with the worst-case scenarios if you end up with some
> failure case that you are not equipped to handle properly.
>
> There are deployments which favor
> "rather online with _potential_ split brain" over
> "rather offline just in case".

There's an arbitrator which should help in case of split brain.

> Document this, print it out on paper,
>
>    "I am aware that this may lead to lost transactions,
>    data divergence, data corruption, or data loss.
>    I am personally willing to take the blame,
>    and live with the consequences."
>
> Have some "boss" sign that ^^^
> in the real world using a real pen.

Well, of course running such a "stretch" cluster would be
rather different from a "normal" one.

The essential thing is that there's no fencing, unless configured
as a dead-man switch for the ticket. Given that booth has a
"sanity" program hook, maybe that could be utilized to verify if
this side of the cluster is healthy enough.

Thanks,

Dejan

> Lars
>
> --
> : Lars Ellenberg
> : http://www.LINBIT.com | Your Way to High Availability
> : DRBD, Linux-HA and Pacemaker support and consulting
>
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
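On the "sanity" hook mentioned above: the booth piece that appears to serve
this purpose is the per-ticket before-acquire-handler, which has to succeed
before a site is granted the ticket. A fragment only, with an invented
script path, and the directive name may differ between booth versions:

    ticket = "ticket-A"
        before-acquire-handler = /usr/local/bin/site-is-healthy.sh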
[Pacemaker] pacemaker does not start after cman config
Hi all,

I was following the guide from clusterlabs but using Debian wheezy:

corosync 1.4.2-3
pacemaker 1.1.7-1
cman 3.0.12-3.2+deb7u2

I configured the active/passive setup with no problems, but as soon as I
try to configure active/active with cman, pacemaker doesn't start anymore.
It doesn't even write anything related to pacemaker in the logs. Any ideas
how to get a hint? Suggestions?

I am following this guide:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/ch08.html

Thank you in advance!

### /etc/init.d/service.d/pcmk is removed

Starting cluster:
   Checking Network Manager... [ OK ]
   Global setup... [ OK ]
   Loading kernel modules... [ OK ]
   Mounting configfs... [ OK ]
   Starting cman... [ OK ]
   Waiting for quorum... [ OK ]
   Starting fenced... [ OK ]
   Starting dlm_controld... [ OK ]
   Starting gfs_controld... [ OK ]
   Unfencing self... [ OK ]
   Joining fence domain... [ OK ]

root@vm-2:~# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    264   2015-02-06 10:09:15  vm-1.cluster.com
   2   M    256   2015-02-06 10:08:59  vm-2.cluster.com

root@vm-2:~# /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager: [FAILED]

root@vm-2:/var/log/cluster# cat corosync.log
Feb 06 10:43:29 corosync [MAIN ] Corosync Cluster Engine ('1.4.2'): started and ready to provide service.
Feb 06 10:43:29 corosync [MAIN ] Corosync built-in features: nss
Feb 06 10:43:29 corosync [MAIN ] Successfully read config from /etc/cluster/cluster.conf
Feb 06 10:43:29 corosync [MAIN ] Successfully parsed cman config
Feb 06 10:43:29 corosync [MAIN ] Successfully configured openais services to load
Feb 06 10:43:29 corosync [TOTEM ] Token Timeout (1 ms) retransmit timeout (2380 ms)
Feb 06 10:43:29 corosync [TOTEM ] token hold (1894 ms) retransmits before loss (4 retrans)
Feb 06 10:43:29 corosync [TOTEM ] join (60 ms) send_join (0 ms) consensus (2 ms) merge (200 ms)
Feb 06 10:43:29 corosync [TOTEM ] downcheck (1000 ms) fail to recv const (2500 msgs)
Feb 06 10:43:29 corosync [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1402
Feb 06 10:43:29 corosync [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Feb 06 10:43:29 corosync [TOTEM ] missed count const (5 messages)
Feb 06 10:43:29 corosync [TOTEM ] send threads (0 threads)
Feb 06 10:43:29 corosync [TOTEM ] RRP token expired timeout (2380 ms)
Feb 06 10:43:29 corosync [TOTEM ] RRP token problem counter (2000 ms)
Feb 06 10:43:29 corosync [TOTEM ] RRP threshold (10 problem count)
Feb 06 10:43:29 corosync [TOTEM ] RRP multicast threshold (100 problem count)
Feb 06 10:43:29 corosync [TOTEM ] RRP automatic recovery check timeout (1000 ms)
Feb 06 10:43:29 corosync [TOTEM ] RRP mode set to none.
Feb 06 10:43:29 corosync [TOTEM ] heartbeat_failures_allowed (0)
Feb 06 10:43:29 corosync [TOTEM ] max_network_delay (50 ms)
Feb 06 10:43:29 corosync [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Feb 06 10:43:29 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Feb 06 10:43:29 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Feb 06 10:43:29 corosync [IPC ] you are using ipc api v2
Feb 06 10:43:29 corosync [TOTEM ] Receive multicast socket recv buffer size (262142 bytes).
Feb 06 10:43:29 corosync [TOTEM ] Transmit multicast socket send buffer size (262142 bytes).
Feb 06 10:43:29 corosync [TOTEM ] The network interface [192.168.1.7] is now up.
Feb 06 10:43:29 corosync [TOTEM ] Created or loaded sequence id 108.192.168.1.7 for this ring.
Feb 06 10:43:29 corosync [QUORUM] Using quorum provider quorum_cman
Feb 06 10:43:29 corosync [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Feb 06 10:43:29 corosync [CMAN ] CMAN starting
Feb 06 10:43:29 corosync [CMAN ] memb: Got node vm-1.cluster.com from ccs (id=1, votes=1)
Feb 06 10:43:29 corosync [CMAN ] memb: add_new_node: vm-1.cluster.com, (id=1, votes=1) newalloc=1
Feb 06 10:43:29 corosync [CMAN ] memb: Got node vm-2.cluster.com from ccs (id=2, votes=1)
Feb 06 10:43:29 corosync [CMAN ] memb: add_new_node: vm-2.cluster.com, (id=2, votes=1) newalloc=1
Feb 06 10:43:29 corosync [CMAN ] memb: add_new_node: vm-2.cluster.com, (id=2, votes=1) newalloc=0
Feb 06 10:43:29 corosync [CMAN ] CMAN 3.0.12 (built Jan 12 2013 15:20:22) started
Feb 06 10:43:29 corosync [SERV ] Service engine loaded: corosync CMAN membership service 2.90
Feb 06 10:43:29 corosync [SERV ] Service engine loaded: openais cluster membership service B.01.01
Feb 06 10:43:29 corosync [EVT ] Evt exec init request
Feb 06 10:43:29 corosync [SERV ] Service engine loaded: openais event service B.01.01
Feb 06 10:43:29 corosync [SERV ] Service engine loaded: openais checkpoint service B.01.01
Feb 06 10:43:29 corosync [MSG ] [DEBUG]: msg_exec_init_fn
Feb 06 10:43:29 corosync [SERV ] Service engine loade
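A few checks that may narrow this down before digging further; these are
generic diagnostics only, not a diagnosis, and the log locations on wheezy
may differ:

    ccs_config_validate        # validate /etc/cluster/cluster.conf (if available)
    cman_tool status           # quorum reached and both expected nodes present?
    grep -i pacemaker /var/log/daemon.log /var/log/syslog   # pacemakerd's own startup errors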
[Pacemaker] Active/Active
Hello,

I would like to create an active/active cluster by using pacemaker and
corosync on Debian. I have followed the documentation
http://clusterlabs.org/doc/Cluster_from_Scratch.pdf. It works well until
8.2.2 Create and Populate a GFS2 Partition. When I try to mount the disk
/dev/drbd1 as /mnt, the output is:

gfs_controld join connect error: Connection refused
error mounting lockproto lock_dlm

I have read that it is necessary to use cman, but then the resources
created by pacemaker (with the command crm configure primitive ...) don't
appear in the output of crm_mon.

What could I do?

--
Saludos,

José Luis
--
Profesor Informática IES Jacarandá - Brenes (Sevilla)
http://www.iesjacaranda.es - www.iesjacaranda-brenes.org
twitter: @jlrod2
[Pacemaker] why sometimes pengine seems lazy
hi:

I was using pacemaker and drbd with SL Linux 6.5/6.6; all was fine. Now I
am testing SL Linux 7.0, and I notice that when I want to promote the drbd
resource with "pcs resource meta my-ms-drbd master-max=2", sometimes
pengine finds the change immediately, but sometimes it finds the change
only after about a minute. I don't know if the delay is normal? I didn't
notice the delay when I was using SL Linux 6.5/6.6.

The "good" result: kvm-3-ms-drbd set master-max=2 at 13:00:07 and pengine
found it at 13:00:07:

Feb 10 13:00:06 [2893] love1-test.lhy.com.tw cib: info: cib_process_request: Completed cib_query operation for section //constraints: OK (rc=0, origin=love2-test.lhy.com.tw/cibadmin/2, version=0.2084.3)
Feb 10 13:00:06 [2893] love1-test.lhy.com.tw cib: info: cib_process_request: Completed cib_query operation for section //constraints: OK (rc=0, origin=love2-test.lhy.com.tw/cibadmin/2, version=0.2084.3)
Feb 10 13:00:06 [2893] love1-test.lhy.com.tw cib: info: cib_process_request: Completed cib_query operation for section //constraints: OK (rc=0, origin=love2-test.lhy.com.tw/cibadmin/2, version=0.2084.3)
Feb 10 13:00:07 [2893] love1-test.lhy.com.tw cib: notice: cib:diff: Diff: --- 0.2084.3
Feb 10 13:00:07 [2893] love1-test.lhy.com.tw cib: notice: cib:diff: Diff: +++ 0.2085.1 206a58e68f4a9cd8e72c7ebb40bef026
Feb 10 13:00:07 [2893] love1-test.lhy.com.tw cib: notice: cib:diff: --
Feb 10 13:00:07 [2893] love1-test.lhy.com.tw cib: notice: cib:diff: ++
Feb 10 13:00:07 [2893] love1-test.lhy.com.tw cib: info: cib_process_request: Completed cib_replace operation for section configuration: OK (rc=0, origin=love2-test.lhy.com.tw/cibadmin/2, version=0.2085.1)
Feb 10 13:00:07 [2898] love1-test.lhy.com.tw crmd: info: abort_transition_graph: te_update_diff:126 - Triggered transition abort (complete=1, node=, tag=diff, id=(null), magic=NA, cib=0.2085.1) : Non-status change
Feb 10 13:00:07 [2898] love1-test.lhy.com.tw crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Feb 10 13:00:07 [2893] love1-test.lhy.com.tw cib: info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crmd/839, version=0.2085.1)
Feb 10 13:00:07 [2893] love1-test.lhy.com.tw cib: info: write_cib_contents: Archived previous version as /var/lib/pacemaker/cib/cib-17.raw
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: notice: unpack_config: On loss of CCM Quorum: Ignore
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: info: determine_online_status: Node love2-test.lhy.com.tw is online
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: info: determine_online_status: Node love1-test.lhy.com.tw is online
Feb 10 13:00:07 [2893] love1-test.lhy.com.tw cib: info: write_cib_contents: Wrote version 0.2085.0 of the CIB to disk (digest: bfdd9b0a25cde05a4b2777b6fc670519)
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: notice: unpack_rsc_op: Operation monitor found resource kvm-6-drbd:0 active in master mode on love2-test.lhy.com.tw
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: info: unpack_rsc_op: Operation monitor found resource kvm-6 active on love2-test.lhy.com.tw
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: info: unpack_rsc_op: Operation monitor found resource kvm-1-drbd:1 active on love1-test.lhy.com.tw
Feb 10 13:00:07 [2893] love1-test.lhy.com.tw cib: info: retrieveCib: Reading cluster configuration from: /var/lib/pacemaker/cib/cib.2Mn5wa (digest: /var/lib/pacemaker/cib/cib.0nfve5)
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: info: unpack_rsc_op: Operation monitor found resource kvm-3-drbd:1 active on love1-test.lhy.com.tw
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: notice: unpack_rsc_op: Re-initiated expired calculated failure kvm-4_last_failure_0 (rc=7, magic=0:7;144:22:0:87034531-de2d-4395-b3c0-9bc0cecfc50e) on love1-test.lhy.com.tw
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: info: unpack_rsc_op: Operation monitor found resource kvm-2-drbd:1 active on love1-test.lhy.com.tw
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: info: unpack_rsc_op: Operation monitor found resource kvm-5 active on love1-test.lhy.com.tw
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: notice: unpack_rsc_op: Operation monitor found resource kvm-5-drbd:1 active in master mode on love1-test.lhy.com.tw
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw pengine: info: unpack_rsc_op: Operation monitor found resource kvm-6-drbd:1 active on love1-test.lhy.com.tw
Feb 10 13:00:07 [2897] love1-test.lhy.com.tw
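To see where the minute goes in the "slow" case, it may help to line up the
same three events shown above (cib_replace, abort_transition_graph,
do_state_transition) on both nodes; a sketch only, and the log path is an
assumption, so adjust it to wherever pacemaker logs on SL 7:

    grep -E "cib_replace|abort_transition_graph|do_state_transition" \
        /var/log/pacemaker.log
    # Purely a guess at something to rule out, not a diagnosis: whether the
    # delayed runs coincide with the periodic re-check timer.
    pcs property list --all | grep cluster-recheck-interval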