Re: [Pacemaker] Problems with SBD fencing
On 06/08/13 13:49, Jan Christian Kaldestad wrote: In my case this does not work - read my original post. So I wonder if there is a pacemaker bug (version 1.1.9-2db99f1). Killing pengine and stonithd on the node which is supposed to "shoot" seems to resolve the problem, though this is not a solution of course. I also tested two separate stonith resources, one on each node. This stonith'ing works fine with this configuration. Is there something "wrong" about doing it this way?

To make this work for me (Ubuntu 12.04), I had to create the /etc/sysconfig/sbd file with:

SBD_DEVICE="/dev/disk/by-id/wwn-0x6006016009702500a4227a04c6b0e211-part1"
SBD_OPTS="-W"

and the resource configuration is:

primitive stonith_sbd stonith:external/sbd \
    params sbd_device="/dev/disk/by-id/wwn-0x6006016009702500a4227a04c6b0e211-part1" \
    meta target-role="Started"

where /dev/disk/by-id/wwn-0x6006016009702500a4227a04c6b0e211-part1 is my disk device.

-- Angel L. Mateo Martínez, Sección de Telemática, Área de Tecnologías de la Información y las Comunicaciones Aplicadas (ATICA), http://www.um.es/atica, Tfo: 868889150, Fax: 86337

___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
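Before handing the partition to the stonith resource, it can help to confirm the SBD header and slots are readable from the node. A sketch using the sbd utility (assuming the sbd binary is in PATH; the device path is the one from the config above):

```shell
# Dump the SBD header to confirm the partition is initialized
sbd -d /dev/disk/by-id/wwn-0x6006016009702500a4227a04c6b0e211-part1 dump

# List the messaging slots and any pending messages per node
sbd -d /dev/disk/by-id/wwn-0x6006016009702500a4227a04c6b0e211-part1 list
```

If `dump` fails, the device was never initialized with `sbd -d <dev> create` and the agent cannot work regardless of the Pacemaker configuration.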
Re: [Pacemaker] Problems with SBD fencing
On 06/08/13 13:49, Jan Christian Kaldestad wrote: In my case this does not work - read my original post. So I wonder if there is a pacemaker bug (version 1.1.9-2db99f1). Killing pengine and stonithd on the node which is supposed to "shoot" seems to resolve the problem, though this is not a solution of course. I also tested two separate stonith resources, one on each node. This stonith'ing works fine with this configuration. Is there something "wrong" about doing it this way?

Are you sure you have the property stonith-enabled="true" set?

-- Angel L. Mateo Martínez, Sección de Telemática, Área de Tecnologías de la Información y las Comunicaciones Aplicadas (ATICA), http://www.um.es/atica, Tfo: 868889150, Fax: 86337
Re: [Pacemaker] postgresql failover
Hi Gregg

2013/8/16 Gregg Jaskiewicz:
> Running rsync -avzPc -e 'ssh -o UserKnownHostsFile=/dev/null' --delete-during 10.0.1.100:/var/lib/pgsql/9.2/data/pg_archive /var/lib/pgsql/9.2/data/ on each slave fixes it - but the question then is - why can't this be done automatically by the RA?

I think having the RA run rsync is over the top. In addition, it may cause the monitor operation to time out.

> Andrew on irc suggested I use restart_on_promote, but I have a feeling this can be done without restarting anything - however the RA itself would have to be fixed, and I can't do it myself to propose a fix, or submit a patch.

I recommend using the restart_on_promote parameter too, because the timeline ID is incremented when promote is called. If you use restart_on_promote="true", slaves may be able to connect to the new master without rsync.

Thanks,
Takatoshi MATSUO
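For reference, a minimal sketch of a pgsql master/slave resource using restart_on_promote. This is illustrative only: the paths, node names, and the 10.0.1.100 master IP are assumptions (the IP is borrowed from the rsync command above), not the poster's actual configuration:

```
primitive pgsql ocf:heartbeat:pgsql \
    params pgctl="/usr/pgsql-9.2/bin/pg_ctl" psql="/usr/pgsql-9.2/bin/psql" \
        pgdata="/var/lib/pgsql/9.2/data" rep_mode="sync" \
        node_list="node1 node2" master_ip="10.0.1.100" \
        restore_command="cp /var/lib/pgsql/9.2/data/pg_archive/%f %p" \
        restart_on_promote="true" \
    op monitor interval="4s" timeout="60s" \
    op monitor interval="3s" role="Master" timeout="60s"
ms ms_pgsql pgsql \
    meta master-max="1" master-node-max="1" clone-max="2" \
        clone-node-max="1" notify="true"
```

With restart_on_promote="true" the RA restarts PostgreSQL on promotion, so the slaves see the new timeline without a manual rsync of the archive.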
Re: [Pacemaker] cibsecret not found
On 08/19/13 21:20, Халезов Иван wrote:
> Hi All!
>
> According to the crm documentation (http://doc.opensuse.org/products/draft/SLE-HA/SLE-ha-guide_sd_draft/cha.ha.manual_config.html#sec.ha.config.crm.setpwd) I am trying to set up a secret password parameter for my resource:
>
> [root@server]# crm resource secret Journal set password xx
> /bin/sh: cibsecret: command not found
>
> I use pacemaker 1.1.9.
> If there is no cibsecret command, what is the right way to store passwords in the configuration?

You have to configure the build with "--with-cibsecrets" when building.

Regards,
Gao,Yan
--
Gao,Yan
Software Engineer
China Server Team, SUSE.
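A sketch of what that rebuild and the subsequent low-level usage might look like (build steps depend on your source tree and distro; "Journal", "password", and "xx" are the values from the original post):

```shell
# When building pacemaker from source, enable the secrets feature
./configure --with-cibsecrets
make && make install

# Afterwards the crm call above maps roughly onto the low-level tool:
cibsecret set Journal password xx
cibsecret get Journal password
```

Distribution packages built without this flag simply do not ship the cibsecret helper, which is why crm falls back to the "command not found" error.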
Re: [Pacemaker] error: qb_ipcs_us_connection_acceptor: Could not accept client connection: Too many open files (24)
- Original Message -
> From: "Nikola Ciprich"
> To: pacemaker@oss.clusterlabs.org
> Sent: Tuesday, August 6, 2013 5:13:02 AM
> Subject: [Pacemaker] error: qb_ipcs_us_connection_acceptor: Could not accept client connection: Too many open files (24)
>
> Hi,
>
> I'd like to ask whether somebody has met a similar bug.
>
> On one of the test two-node clusters, a node suddenly hung, and cib started spawning the following messages:
>
> error: qb_ipcs_us_connection_acceptor: Could not accept client connection: Too many open files (24)
>
> In lsof, I see over a thousand opened /dev/shm files:

What version of libqb do you have installed? If you can, try upgrading libqb.

-- Vossel

> cib 5737 hacluster DEL REG 0,14 2615869 /dev/shm/qb-cib_rw-control-5737-25733-179
> cib 5737 hacluster DEL REG 0,14 2545021 /dev/shm/qb-cib_rw-control-5737-4605-178
> cib 5737 hacluster DEL REG 0,14 2410274 /dev/shm/qb-cib_rw-control-5737-1925-180
> cib 5737 hacluster DEL REG 0,14 2545640 /dev/shm/qb-cib_rw-control-5737-8828-177
> cib 5737 hacluster DEL REG 0,14 2495467 /dev/shm/qb-cib_rw-control-5737-2054-174
> cib 5737 hacluster DEL REG 0,14 2434602 /dev/shm/qb-cib_rw-control-5737-8659-176
>
> and also sockets:
>
> cib 5737 hacluster 1003u unix 0x880eaefee000 0t0 13885836 socket
> cib 5737 hacluster 1004u unix 0x880eada76000 0t0 13849634 socket
> cib 5737 hacluster 1005u unix 0x880eb37e7400 0t0 13847814 socket
> cib 5737 hacluster 1006u unix 0x88099c120400 0t0 13866356 socket
> cib 5737 hacluster 1007u unix 0x880eb7764000 0t0 13911546 socket
> cib 5737 hacluster 1008u unix 0x880a7f579400 0t0 13847938 socket
> cib 5737 hacluster 1009u unix 0x880a7f57e000 0t0 1388 socket
>
> OS is the latest CentOS 6 (RHEL6 clone), running an x86_64 3.0.87 kernel.
>
> Other important packages:
>
> pacemaker-1.1.8-7.el6.x86_64
> cluster-glue-1.0.5-6.el6.x86_64
> clusterlib-3.0.12.1-49.el6.x86_64
>
> Any idea what this could be? Is this a known bug?
>
> with best regards
>
> nik
>
> --
> Ing. Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28.rijna 168, 709 00 Ostrava
>
> tel.: +420 591 166 214
> fax: +420 596 621 273
> mobil: +420 777 093 799
> www.linuxbox.cz
>
> mobil servis: +420 737 238 656
> email servis: ser...@linuxbox.cz
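Independent of the libqb fix itself, errno 24 here is the generic EMFILE per-process descriptor limit. A small self-contained Python sketch (not Pacemaker-specific) of how a descriptor leak like the /dev/shm handles above ends in exactly this error:

```python
import errno
import os
import resource

# Lower the soft limit on open files so the simulated leak hits it quickly
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (64, hard))

fds = []
try:
    while True:
        # Each unclosed descriptor counts against RLIMIT_NOFILE,
        # just like the leaked qb-cib_rw-control handles in the report
        fds.append(os.open("/dev/null", os.O_RDONLY))
except OSError as e:
    # errno 24 on Linux: EMFILE, "Too many open files"
    print(e.errno, e.errno == errno.EMFILE)
finally:
    for fd in fds:
        os.close(fd)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```

The `lsof` output in the report is the server-side symptom of the same condition: once cib exhausts its limit, `accept()` in libqb fails with EMFILE for every new IPC client.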
Re: [Pacemaker] Dual primary drbd + ocfs2: problems starting o2cb
16.08.2013 16:04, Elmar Marschke wrote:
> Hi all,
>
> i'm working on a two node pacemaker cluster with dual primary drbd and ocfs2.
>
> Dual pri drbd and ocfs2 WITHOUT pacemaker work fine (mounting, reading, writing, everything...).

ocfs2 uses its own clustering stack by default.

> When i try to make this work in pacemaker, there seems to be a problem to start the o2cb resource.
>
> My (already simplified) configuration is:
> -
> node poc1 \
>     attributes standby="off"
> node poc2 \
>     attributes standby="off"
> primitive res_dlm ocf:pacemaker:controld \
>     op monitor interval="120"
> primitive res_drbd ocf:linbit:drbd \
>     params drbd_resource="r0" \
>     op stop interval="0" timeout="100" \
>     op start interval="0" timeout="240" \
>     op promote interval="0" timeout="90" \
>     op demote interval="0" timeout="90" \
>     op notifiy interval="0" timeout="90" \
>     op monitor interval="40" role="Slave" timeout="20" \
>     op monitor interval="20" role="Master" timeout="20"
> primitive res_o2cb ocf:pacemaker:o2cb \
>     op monitor interval="60"
> ms ms_drbd res_drbd \
>     meta notify="true" master-max="2" master-node-max="1" target-role="Started"
> property $id="cib-bootstrap-options" \
>     no-quorum-policy="ignore" \
>     stonith-enabled="false" \
>     dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     last-lrm-refresh="1376574860"

Side note: you need to run both dlm and o2cb as clones, and group them (either with "group" or with a pair of colocation/order statements), so that ocfs2_controld is started when dlm_controld is already running. You probably already tried that, but do not forget the last part of this.
>
> First error message in corosync.log as far as i can identify it:
>
> lrmd: [5547]: info: RA output: (res_dlm:probe:stderr) dlm_controld.pcmk: no process found
> [ other stuff ]
> lrmd: [5547]: info: RA output: (res_dlm:start:stderr) dlm_controld.pcmk: no process found
> [ other stuff ]
> lrmd: [5547]: info: RA output: (res_o2cb:start:stderr) 2013/08/16_13:25:20 ERROR: ocfs2_controld.pcmk did not come up
>
> (
> You can find the whole corosync logfile (starting corosync on node 1 from the beginning until after starting of resources) on: http://www.marschke.info/corosync_drei.log
> )
>
> syslog shows:
> -
> ocfs2_controld.pcmk[5774]: Unable to connect to CKPT: Object does not exist

How exactly did you start the corosync process - as "corosync" or as "openais"? Background: the CKPT service is not loaded by corosync by default, only when the stack is started via the openais script; you may want to look at that script for details.

>
> Output of crm_mon:
> --
>
> Stack: openais
> Current DC: poc1 - partition WITHOUT quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 4 Resources configured.
>
> Online: [ poc1 ]
> OFFLINE: [ poc2 ]
>
> Master/Slave Set: ms_drbd [res_drbd]
>     Masters: [ poc1 ]
>     Stopped: [ res_drbd:1 ]
> res_dlm (ocf::pacemaker:controld) Started poc1
>
> Migration summary:
> * Node poc1:
>     res_o2cb: migration-threshold=100 fail-count=100
>
> Failed actions:
>     res_o2cb_start_0 (node=poc1, call=6, rc=1, status=complete): unknown error
>
> -
> This is the situation after a reboot of node poc1. For simplification i left pacemaker / corosync unstarted on the second node, and already removed a group and a clone resource where dlm and o2cb already had been in (errors were there also).
>
> Is my configuration of the resource agents correct?
> I checked using "ra meta ...", but as far as i recognized everything is ok.
>
> Is some piece of software missing?
> dlm-pcmk is installed, ocfs2_controld.pcmk and dlm_controld.pcmk are available, i even did additional links in /usr/sbin:
> root@poc1:~# which ocfs2_controld.pcmk
> /usr/sbin/ocfs2_controld.pcmk
> root@poc1:~# which dlm_controld.pcmk
> /usr/sbin/dlm_controld.pcmk
> root@poc1:~#
>
> I already googled but couldn't find anything useful. Thanks for any hints... :)
>
> kind regards
> elmar
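A sketch of the clone/group arrangement the side note above describes, reusing the resource IDs from the posted configuration (the constraint names and inf: scores are illustrative):

```
group grp_dlm_o2cb res_dlm res_o2cb
clone cl_dlm_o2cb grp_dlm_o2cb \
    meta interleave="true"
order ord_drbd_before_o2cb inf: ms_drbd:promote cl_dlm_o2cb:start
colocation col_o2cb_with_drbd inf: cl_dlm_o2cb ms_drbd:Master
```

The group guarantees dlm_controld is running before ocfs2_controld starts, the clone puts one instance of each on every node, and the order/colocation pair ties the whole stack to a promoted DRBD instance.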
Re: [Pacemaker] Dual primary drbd + ocfs2: problems starting o2cb
> -Original Message- > From: Elmar Marschke [mailto:elmar.marsc...@schenker.at] > Sent: Friday, August 16, 2013 10:31 PM > To: pacemaker@oss.clusterlabs.org > Subject: Re: [Pacemaker] Dual primary drbd + ocfs2: problems starting o2cb > > > Am 16.08.2013 15:46, schrieb Jake Smith: > >> -Original Message- > >> From: Elmar Marschke [mailto:elmar.marsc...@schenker.at] > >> Sent: Friday, August 16, 2013 9:05 AM > >> To: The Pacemaker cluster resource manager > >> Subject: [Pacemaker] Dual primary drbd + ocfs2: problems starting > >> o2cb > >> > >> Hi all, > >> > >> i'm working on a two node pacemaker cluster with dual primary drbd > >> and ocfs2. > >> > >> Dual pri drbd and ocfs2 WITHOUT pacemaker work fine (mounting, > >> reading, writing, everything...). > >> > >> When i try to make this work in pacemaker, there seems to be a > >> problem > > to > >> start the o2cb resource. > >> > >> My (already simplified) configuration is: > >> - > >> node poc1 \ > >>attributes standby="off" > >> node poc2 \ > >>attributes standby="off" > >> primitive res_dlm ocf:pacemaker:controld \ > >>op monitor interval="120" > >> primitive res_drbd ocf:linbit:drbd \ > >>params drbd_resource="r0" \ > >>op stop interval="0" timeout="100" \ > >>op start interval="0" timeout="240" \ > >>op promote interval="0" timeout="90" \ > >>op demote interval="0" timeout="90" \ > >>op notifiy interval="0" timeout="90" \ > >>op monitor interval="40" role="Slave" timeout="20" \ > >>op monitor interval="20" role="Master" timeout="20" > >> primitive res_o2cb ocf:pacemaker:o2cb \ > >>op monitor interval="60" > >> ms ms_drbd res_drbd \ > >>meta notify="true" master-max="2" master-node-max="1" target- > >> role="Started" > >> property $id="cib-bootstrap-options" \ > >>no-quorum-policy="ignore" \ > >>stonith-enabled="false" \ > >>dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \ > >>cluster-infrastructure="openais" \ > >>expected-quorum-votes="2" \ > >>last-lrm-refresh="1376574860" > >> > > > > Looks like 
you are missing ordering, colocation and clone (or even
> > group, to make it a shorter config; group = order and colocation in one
> > statement) statements. The resources *must* start in a particular
> > order, they must run on the same node, and there must be an instance
> > of each resource on each node.
> >
> > More here for DRBD 8.4:
> > http://www.drbd.org/users-guide/s-ocfs2-pacemaker.html
> > Or DRBD 8.3:
> > http://www.drbd.org/users-guide-8.3/s-ocfs2-pacemaker.html
> >
> > Basically add:
> > group grp_dlm_o2cb res_dlm res_o2cb
> > clone cl_dlm_o2cb grp_dlm_o2cb meta interleave=true
> > order ord_drbd_then_dlm_o2cb res_drbd:promote cl_dlm_o2cb:start
> > colocation col_dlm_o2cb_with_drbdmaster cl_dlm_o2cb res_drbd:Master
> >
> > HTH
> >
> > Jake
>
> Hello Jake,
>
> thanks for your reply. I already had res_dlm and res_o2cb grouped together and cloned like in your advice; indeed this was my initial configuration. But the problem showed up, so i tried to simplify the configuration to reduce possible error sources.
>
> But now it seems i found a solution; or at least a workaround: i just use the LSB resource agent lsb:o2cb. This one works! The resource starts without a problem on both nodes and as far as i can see right now everything is fine (tried with and without additional group and clone resource).
>
> Don't know if this will bring some drawbacks in the future; but for the moment my problem seems to be solved.

Not sure either - usually OCF resource agents are more robust than simple LSB scripts. I would also verify that the o2cb LSB script is fully LSB compliant, or your cluster will have issues.

> Currently it seems to me that there's a subtle problem with the ocf:pacemaker:o2cb resource agent; at least on my system.

Maybe, maybe not - if you take a look at the o2cb resource agent, the error message you were getting comes after it tries to start /usr/sbin/ocfs2_controld.pcmk for 10 seconds without success... I would time starting o2cb.
Might be as simple as allowing more time for startup of the daemon. I've not set up ocfs2 in a while, but I believe you may be able to extend that timeout in the meta of the primitive without having to muck with the actual resource agent.

Jake

> Anyway, thanks a lot for your answer..!
> Best regards
> elmar
>
> >> First error message in corosync.log as far as i can identify it:
> >>
> >> lrmd: [5547]: info: RA output: (res_dlm:probe:stderr) dlm_controld.pcmk:
> >> no process found
> >> [ other stuff ]
> >> lrmd: [5547]: info: RA output: (res_dlm:start:stderr) dlm_controld.pcmk:
> >> no process found
> >> [ other stuff ]
> >> lrmd: [5547]: info: RA output: (res_o2cb:start:stderr)
> >> 2013/08/16_13:25:20 ERROR: ocfs2_controld.pcmk did not come up
> >>
> >> (
> >> You can find the whole
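A sketch of what extending the start timeout on the primitive could look like (the 90-second value is arbitrary; whether the op timeout actually influences the agent's internal 10-second daemon wait depends on the agent version, so treat this as a first thing to try, not a confirmed fix):

```
primitive res_o2cb ocf:pacemaker:o2cb \
    op start interval="0" timeout="90" \
    op monitor interval="60"
```

If the agent still gives up after 10 seconds regardless of the op timeout, timing `ocfs2_controld.pcmk` startup by hand, as suggested above, would show whether slow daemon startup is the real cause.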
[Pacemaker] cibsecret not found
Hi All!

According to the crm documentation (http://doc.opensuse.org/products/draft/SLE-HA/SLE-ha-guide_sd_draft/cha.ha.manual_config.html#sec.ha.config.crm.setpwd) I am trying to set up a secret password parameter for my resource:

[root@server]# crm resource secret Journal set password xx
/bin/sh: cibsecret: command not found

I use pacemaker 1.1.9.
If there is no cibsecret command, what is the right way to store passwords in the configuration?

Thank you in advance
Ivan Khalezov.
[Pacemaker] PostgreSQL failed to stop after streaming replication established
Dear community.

The scenario of the redundant environment, in "graphic" representation:

                +---------------------+
                |         WAN         |
                +----------+----------+
                           |
    +----------v----------+    +----------+----------+
    |  pgsql   |  pgsql   |    |  pgsql   |  pgsql   |
    +----------+----------+    +----------+----------+
    | drbd-pri | drbd-sec |    | drbd-pri | drbd-sec |
    +----------+----------+    +----------+----------+
    |      pacemaker      |    |      pacemaker      |
    +---------------------+    +---------------------+
    |      corosync       |    |      corosync       |
    +----------+----------+    +----------+----------+
    |  node1   |  node2   |    |  node1   |  node2   |
    +----------+----------+    +----------+----------+
             TC1                        TC2

Within each technical center everything worked fine when migrating resources between nodes. Then I set up streaming replication from TC1 to TC2. Now migration from one node to another fails: the Pacemaker stop operation on the postgres resource FAILED. PostgreSQL itself was stopped, but postmaster.pid remained corrupted.

This is where I ended up: I am unable to stop the postgresql service correctly on TC1 (the streaming replication master). After issuing /etc/init.d/postgresql-9.2 stop, postmaster.pid remains on the filesystem and moreover it is corrupted - I am unable to delete it with the rm command. It looks like this:

[root@pcmk1 ~]# ll /var/lib/pgsql/9.2/data/
ls: cannot access /var/lib/pgsql/9.2/data/postmaster.pid: No such file or directory
total 56
drwx------ 7 postgres postgres    62 Jun 26 17:13 base
drwx------ 2 postgres postgres  4096 Aug 18 00:25 global
drwx------ 2 postgres postgres    17 Jun 26 09:54 pg_clog
-rw------- 1 postgres postgres  5127 Aug 17 16:24 pg_hba.conf
-rw------- 1 postgres postgres  1636 Jun 26 09:54 pg_ident.conf
drwx------ 2 postgres postgres  4096 Jul  2 00:00 pg_log
drwx------ 4 postgres postgres    34 Jun 26 09:53 pg_multixact
drwx------ 2 postgres postgres    17 Aug 18 00:23 pg_notify
drwx------ 2 postgres postgres     6 Jun 26 09:53 pg_serial
drwx------ 2 postgres postgres     6 Jun 26 09:53 pg_snapshots
drwx------ 2 postgres postgres     6 Aug 18 00:25 pg_stat_tmp
drwx------ 2 postgres postgres    17 Jun 26 09:54 pg_subtrans
drwx------ 2 postgres postgres     6 Jun 26 09:53 pg_tblspc
drwx------ 2 postgres postgres     6 Jun 26 09:53 pg_twophase
-rw------- 1 postgres postgres     4 Jun 26 09:53 PG_VERSION
drwx------ 3 postgres postgres  4096 Aug 18 00:25 pg_xlog
-rw------- 1 postgres postgres 19884 Aug 17 22:54 postgresql.conf
-rw------- 1 postgres postgres    71 Aug 18 00:23 postmaster.opts
?????????? ? ?        ?            ?            ? postmaster.pid
-rw-r--r-- 1 postgres postgres   491 Aug 17 16:33 recovery.done

I don't know if the resource agent did something wrong while pacemaker tried stopping postgres, or whether postgres itself is the component that failed to stop correctly. What do you think? Has somebody experienced a problem like this?

I am using:
- pacemaker-1.1.7-6
- corosync-1.4.1-7
- resource-agents-3.9.2-12
- drbd-8.4.3-2

CONFIGURATION

[root@pcmk2 9.2]# crm configure show
node pcmk1 \
    attributes standby="off"
node pcmk2 \
    attributes standby="off"
primitive drbd_pg ocf:linbit:drbd \
    params drbd_resource="postgres" \
    op monitor interval="15" role="Master" \
    op monitor interval="16" role="Slave" \
    op start interval="0" timeout="240" \
    op stop interval="0" timeout="120"
primitive pg_fs ocf:heartbeat:Filesystem \
    params device="/dev/vg_local-lv_pgsql/lv_pgsql" directory="/var/lib/pgsql/9.2/data" options="noatime,nodiratime" fstype="xfs" \
    op start interval="0" timeout="60" \
    op stop interval="0" timeout="120"
primitive pg_lsb lsb:postgresql-9.2 \
    op monitor interval="30" timeout="60" \
    op start interval="0" timeout="60" \
    op stop interval="0" timeout="60"
primitive pg_lvm ocf:heartbeat:LVM \
    params volgrpname="vg_local-lv_pgsql" \
    op start interval="0" timeout="30" \
    op stop interval="0" timeout="30"
primitive pg_vip ocf:heartbeat:IPaddr2 \
    params ip="x.x.x.x" iflabel="pcmkvip" \
    op monitor interval="5"
group PGServer pg_lvm pg_fs pg_lsb pg_vip \
    meta target-role="Started"
ms ms_drbd_pg drbd_pg \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
location master-prefer-node1 pg_vip 50: pcmk1
colocation col_pg_drbd inf: PGServer ms_drbd
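A listing full of '?' columns means stat() itself fails on that directory entry, which points at filesystem-level damage rather than at PostgreSQL or the resource agent. A hedged sketch of how one might confirm that on the XFS volume from the configuration above (device path taken from the pg_fs primitive; run the repair check only with the resource group stopped and the filesystem unmounted):

```shell
# stat failing on a single entry suggests a damaged inode
stat /var/lib/pgsql/9.2/data/postmaster.pid

# With PGServer stopped and the volume unmounted, check the filesystem.
# -n is no-modify mode: report problems without changing anything.
umount /var/lib/pgsql/9.2/data
xfs_repair -n /dev/vg_local-lv_pgsql/lv_pgsql
```

If xfs_repair reports a corrupt inode, the likely sequence is that DRBD was demoted or the filesystem failed underneath PostgreSQL mid-write during the failover, and the stop failure in Pacemaker is a symptom rather than the cause.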
Re: [Pacemaker] Failed stop of stonith resource
19.08.2013 07:36, Andrew Beekhof wrote: > > On 14/08/2013, at 7:58 AM, Vladislav Bogdanov wrote: > >> 14.08.2013 00:51, Vladislav Bogdanov wrote: >> >> ... >> >>> >>> Sure, reason of the failure of the fence_ipmilan requires investigations >>> too, but that is not important for the above issue I think. >> >> That seems to be stonith-ng failure: > > Did you create a crm_report for this? > There's just not enough context to say anything based on these logs alone. Sent privately. > >> Aug 13 20:56:39 mgmt01 stonith-ng[10206]: notice: log_cib_diff: >> cib_process_diff: Local-only Change: 0.714.10 >> Aug 13 20:56:39 mgmt01 stonith-ng[10206]: notice: cib_process_diff: -- >> >> Aug 13 20:56:39 mgmt01 stonith-ng[10206]: notice: cib_process_diff: ++ >>> crm-debug-origin="do_state_transition" in_ccm="true" expected="member"/> >> Aug 13 20:56:39 mgmt01 stonith-ng[10206]: notice: cib_process_diff: >> Diff 0.714.10 -> 0.714.10 from local not applied to 0.714.10: + and - >> versions in the diff did not change >> Aug 13 20:56:39 mgmt01 stonith-ng[10206]: notice: update_cib_cache_cb: >> [cib_diff_notify] Patch aborted: Application of an update diff failed (-206) >> Aug 13 20:56:53 mgmt01 kernel: dlm: got connection from 10 >> Aug 13 20:57:21 mgmt01 cib[10205]: warning: cib_notify_send_one: >> Notification of client crmd/1a302bfe-5a71-4555-abbb-c030fcb6416d failed >> Aug 13 20:57:26 mgmt01 stonith-ng[10206]: notice: update_cib_cache_cb: >> [cib_diff_notify] Patch aborted: Application of an update diff failed (-206) >> Aug 13 20:57:27 mgmt01 stonith-ng[10206]: notice: update_cib_cache_cb: >> [cib_diff_notify] Patch aborted: Application of an update diff failed (-206) >> >> ...[many lines with the same message]... 
>> >> Aug 13 20:58:56 mgmt01 stonith-ng[10206]: notice: update_cib_cache_cb: >> [cib_diff_notify] Patch aborted: Application of an update diff failed (-206) >> Aug 13 20:58:57 mgmt01 lrmd[10207]: warning: crm_ipc_send: Request 24 >> to stonith-ng (0x19832b0) failed: Resource temporarily unavailable (-11) >> Aug 13 20:58:57 mgmt01 lrmd[10207]:error: stonith_send_command: >> Couldn't perform st_device_register operation (timeout=0s): -11: >> Connection timed out (110) >> Aug 13 20:58:57 mgmt01 stonith-ng[10206]: notice: update_cib_cache_cb: >> [cib_diff_notify] Patch aborted: Application of an update diff failed (-206) >> Aug 13 20:58:58 mgmt01 crmd[10210]:error: process_lrm_event: LRM >> operation stonith-ipmi-v03-b_start_0 (call=568, status=4, >> cib-update=128, confirmed=true) Error >> Aug 13 20:58:58 mgmt01 stonith-ng[10206]: notice: update_cib_cache_cb: >> [cib_diff_notify] Patch aborted: Application of an update diff failed (-206) >> Aug 13 20:58:58 mgmt01 attrd[10208]: notice: attrd_cs_dispatch: Update >> relayed from v03-a >> Aug 13 20:58:58 mgmt01 attrd[10208]: notice: attrd_triggeAug 13 >> 21:00:39 mgmt01 kernel: imklog 5.8.10, log source = /proc/kmsg started. 