Re: [Pacemaker] Notes on pacemaker installation on OmniOS
I am very happy that I somehow triggered this discussion :). What I did was
basically just take the information that was available to me (thanks to
Andreas' notes and mainly the patches he has sent over the years) and provide
a single place where one can look to get pacemaker running on OmniOS. When I
started this work I was a complete newbie to Illumos and pacemaker, and I
realized that a tutorial like this would have saved me a lot of time had it
existed. Unfortunately, as a beginner I couldn't bring much of a critical eye,
so I left some things unquestioned, such as trying to run pacemaker compiled
from the latest sources as root instead of hacluster (my first attempt at
that, with old sources, failed, so I never changed the script again later).
I have now tried using root as CLUSTER_USER in the SMF script and the cluster
seems to run correctly, so I will update this in the post.

2014-11-14 4:02 GMT+01:00 Andrew Beekhof and...@beekhof.net:

> On 14 Nov 2014, at 6:54 am, Grüninger, Andreas (LGL Extern)
> andreas.gruenin...@lgl.bwl.de wrote:
>
>> I am really sorry but I forgot the reason. It is now 2 years ago that I
>> had problems with starting pacemaker as root. If I remember correctly,
>> pacemaker always got "access denied" when connecting to corosync. With a
>> non-root account it worked flawlessly.
>
> Oh! That would be this patch:
> https://github.com/beekhof/pacemaker/commit/3c9275e9
> I always thought there was a philosophical objection.
>
>> The pull request from branch upstream3 can be closed. There is a new pull
>> request from branch upstream4 with the changes against the current master.
>
> Excellent!
>
>> -----Original Message-----
>> From: Andrew Beekhof [mailto:and...@beekhof.net]
>> Sent: Thursday, 13 November 2014 12:11
>> To: The Pacemaker cluster resource manager
>> Subject: Re: [Pacemaker] Notes on pacemaker installation on OmniOS
>>
>> On 13 Nov 2014, at 9:50 pm, Grüninger, Andreas (LGL Extern)
>> andreas.gruenin...@lgl.bwl.de wrote:
>>
>>> I added heartbeat and corosync to have both available. Personally I use
>>> pacemaker/corosync. With the newest version of pacemaker there is no
>>> need any more to run pacemaker as non-root.
>>
>> I'm curious... what was the old reason?
>>
>>> The main problems with pacemaker are the changes of the last months,
>>> especially in services_linux.c.
>>
>> As the name implies, this must be a problem for non-Linux systems.
>>
>>> What is your preferred way to handle e.g. pure Linux kernel functions?
>>
>> Definitely to isolate them with an appropriate #define (preferably by
>> feature availability rather than OS).
>>
>>> I compiled a version of pacemaker yesterday, but from a revision of
>>> pacemaker from August. There are pull requests waiting with patches for
>>> Solaris/Illumos. I guess it would be better to add those patches from
>>> August and my patches from yesterday to the current master. Following
>>> the patch from Vincenzo I changed services_os_action_execute in
>>> services_linux.c and added, for non-Linux systems, a synchronous wait
>>> with ppoll, which is available on Solaris/BSD/MacOS. It should provide
>>> the same functionality, as this function uses file descriptors and
>>> signal handlers. Can pull requests be rejected or withdrawn?
>>
>> Is there anything left in them that needs to go in? If so, can you
>> indicate which parts are needed in those pull requests please? The rest
>> we can close - I didn't want to close them in case there was something I
>> had missed.
>>
>>> Andreas
>>>
>>> -----Original Message-----
>>> From: Andrew Beekhof [mailto:and...@beekhof.net]
>>> Sent: Thursday, 13 November 2014 11:13
>>> To: The Pacemaker cluster resource manager
>>> Subject: Re: [Pacemaker] Notes on pacemaker installation on OmniOS
>>>
>>> Interesting work... a couple of questions...
>>>
>>> - Why heartbeat and corosync?
>>> - Why the need to run pacemaker as non-root?
>>>
>>> Also, I'd really encourage bringing the kinds of patches referenced in
>>> these instructions to the attention of upstream, so that we can work on
>>> getting them merged.
>>>
>>> On 13 Nov 2014, at 7:09 pm, Vincenzo Pii p...@zhaw.ch wrote:
>>>
>>>> Hello,
>>>>
>>>> I have written down my notes on the setup of pacemaker and corosync on
>>>> Illumos (OmniOS). This is just the basic setup, enough to be able to
>>>> run the Dummy resource agent. It took me quite some time to get this
>>>> done, so I want to share what I did, assuming this may help someone
>>>> else. Here's the link:
>>>> http://blog.zhaw.ch/icclab/use-pacemaker-and-corosync-on-illumos-omnios-to-run-a-ha-activepassive-cluster/
>>>>
>>>> A few things:
>>>>
>>>> * Maybe this setup is not optimal in how resource agents are managed by
>>>>   the hacluster user instead of root. This led to some problems; check
>>>>   this thread:
>>>>   https://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg20834.html
>>>> * I took some scripts and the general procedure from Andreas and his
>>>>   page here: http://grueni.github.io/libqb/. Many thanks!
>>>>
>>>> Regards,
>>>> Vincenzo.
>>>>
>>>> --
>>>> Vincenzo Pii
>>>> Researcher, InIT
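
To make the ppoll() approach described above concrete, here is a minimal
sketch of a feature-gated synchronous wait. This is not the actual pull
request: HAVE_SYS_SIGNALFD_H stands in for whatever configure-time feature
test the real patch uses, and the function and variable names are
illustrative.

    /* Minimal sketch of a feature-gated wait, in the spirit of the
     * services_os_action_execute() change discussed above. */
    #define _GNU_SOURCE            /* for ppoll() on glibc */
    #include <poll.h>
    #include <signal.h>
    #include <time.h>

    static int
    sync_wait_for_child(int stdout_fd, long timeout_ms)
    {
        struct pollfd pfd = { .fd = stdout_fd, .events = POLLIN };

    #ifdef HAVE_SYS_SIGNALFD_H     /* feature check, not an OS check */
        /* Linux: the existing signalfd-based code path stays here. */
        return poll(&pfd, 1, (int) timeout_ms);
    #else
        /* Solaris/BSD: ppoll() waits on the file descriptor while
         * atomically applying the given signal mask, so SIGCHLD
         * delivery cannot race with the poll itself. */
        struct timespec ts = {
            .tv_sec  = timeout_ms / 1000,
            .tv_nsec = (timeout_ms % 1000) * 1000000L,
        };
        sigset_t mask;

        sigemptyset(&mask);        /* empty mask: SIGCHLD stays deliverable */
        return ppoll(&pfd, 1, &ts, &mask);
    #endif
    }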
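And for the CLUSTER_USER change mentioned at the top of the thread: the SMF
method script itself is not reproduced here, but a hypothetical excerpt might
look like the following (the pacemakerd path is illustrative; with current
pacemaker sources root works as the cluster user, while older builds needed
hacluster).

    #!/sbin/sh
    # Hypothetical excerpt of an SMF start method for pacemaker on OmniOS.
    . /lib/svc/share/smf_include.sh

    CLUSTER_USER=root   # was "hacluster" for older pacemaker builds

    su - ${CLUSTER_USER} -c "/opt/ha/sbin/pacemakerd" > /dev/null 2>&1 &
    exit $SMF_EXIT_OK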
Re: [Pacemaker] TOTEM Retransmit list in logs when a node gets up
Christine Caulfield ccaul...@redhat.com writes:

[...]

> If it's only happening at startup it could be the switch/router learning
> the ports for the nodes and building its routing tables. Switching to
> udpu will then get rid of the message, if it's annoying.

Switching to udpu makes it work correctly.

Thanks.

--
Daniel Dehennin
Retrieve my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
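
For reference, a minimal sketch of what the switch to unicast looks like in
corosync.conf (corosync 2.x syntax; the addresses are placeholders):

    totem {
        version: 2
        transport: udpu              # unicast UDP instead of multicast
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.1.0
        }
    }

    nodelist {
        node {
            ring0_addr: 192.168.1.11
        }
        node {
            ring0_addr: 192.168.1.12
        }
    }

With udpu, totem traffic is sent to each listed node address directly, so
startup no longer depends on the switch having learned multicast group
membership.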
Re: [Pacemaker] TOTEM Retransmit list in logs when a node gets up
On 14/11/14 11:01, Daniel Dehennin wrote:
> Christine Caulfield ccaul...@redhat.com writes:
>
> [...]
>
>> If it's only happening at startup it could be the switch/router learning
>> the ports for the nodes and building its routing tables. Switching to
>> udpu will then get rid of the message, if it's annoying.
>
> Switching to udpu makes it work correctly.

Ahh, that's good. It sounds like it was something multicast-related (if not
exactly what I thought it might have been)... these things usually are!

Chrissie
[Pacemaker] Long failover
Hello,

We have a cluster configured via pacemaker+corosync+crm. The configuration is:

node master
node slave
primitive HA-VIP1 IPaddr2 \
        params ip=192.168.22.71 nic=bond0 \
        op monitor interval=1s
primitive HA-variator lsb:variator \
        op monitor interval=1s \
        meta migration-threshold=1 failure-timeout=1s
group HA-Group HA-VIP1 HA-variator
property cib-bootstrap-options: \
        dc-version=1.1.10-14.el6-368c726 \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        last-lrm-refresh=1383871087
rsc_defaults rsc-options: \
        resource-stickiness=100

First I take the variator service down on the master node (actually, I delete
the service binary and kill the variator process, so the variator fails to
restart). Resources very quickly move to the slave node, as expected. Then I
put the binary back on the master and restart the variator service.

Now I do the same with the binary and service on the slave node. The crm
status command quickly shows "HA-variator (lsb:variator): Stopped", but it
takes too much time (for us) before resources are switched to the master node
(around 1 min). Then the line

        Failed actions: HA-variator_monitor_1000 on slave 'unknown error' (1):
        call=-1, status=Timed Out, last-rc-change='Sat Dec 21 03:59:45 2013',
        queued=0ms, exec=0ms

appears in crm status and resources are switched. What is that timeout? Where
can I change it?

Kind regards,
Dmitriy Matveichev.
Re: [Pacemaker] drbd / libvirt / Pacemaker Cluster?
Hello,

I have now configured fencing in drbd:

disk {
        fencing resource-only;
}
handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}

And changed the config to:

node $id=1084777473 master \
        attributes standby=off maintenance=off
node $id=1084777474 slave \
        attributes maintenance=off standby=off
primitive libvirt upstart:libvirt-bin \
        op start timeout=120s interval=0 \
        op stop timeout=120s interval=0 \
        op monitor interval=30s \
        meta target-role=Started
primitive vmdata ocf:linbit:drbd \
        params drbd_resource=vmdata \
        op monitor interval=29s role=Master \
        op monitor interval=31s role=Slave
primitive vmdata_fs ocf:heartbeat:Filesystem \
        params device=/dev/drbd0 directory=/vmdata fstype=ext4 \
        meta target-role=Started
ms drbd_master_slave vmdata \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true target-role=Started
location PrimaryNode-libvirt libvirt 200: master
location PrimaryNode-vmdata_fs vmdata_fs 200: master
location SecondaryNode-libvirt libvirt 10: slave
location SecondaryNode-vmdata_fs vmdata_fs 10: slave
colocation libvirt-with-fs inf: libvirt vmdata_fs
colocation services_colo inf: vmdata_fs drbd_master_slave:Master
order fs_after_drbd inf: drbd_master_slave:promote vmdata_fs:start libvirt:start
property $id=cib-bootstrap-options \
        dc-version=1.1.10-42f2063 \
        cluster-infrastructure=corosync \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        last-lrm-refresh=1415964693

But now the cluster won't work anymore: no failover for drbd / libvirt. Both
members always stay in the slave state. When I try to start resources with
crm, no drbd filesystem gets mounted, even though the machine is now master;
after a reboot both stay slave... Also, I can't see the resources with crm
status on the shell, whereas with the old config I could see them both.
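
One point worth knowing about resource-only fencing: crm-fence-peer.sh works
by injecting a temporary location constraint into the CIB, and a stale
constraint will keep the DRBD master role pinned. A sketch of how to look for
one with the crm shell (the constraint id shown is an example; real ids
normally begin with "drbd-fence-by-handler"):

    # List any fencing constraints left behind by the handler:
    crm configure show | grep drbd-fence

    # If the peer is healthy and resynced but the constraint is stale,
    # it can be removed by id:
    crm configure delete drbd-fence-by-handler-vmdata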
Re: [Pacemaker] Long failover
On Fri, Nov 14, 2014 at 2:57 PM, Dmitry Matveichev d.matveic...@mfisoft.ru wrote:
> Hello,
>
> We have a cluster configured via pacemaker+corosync+crm. The configuration is:
>
> [...]
>
> Now I do the same with the binary and service on the slave node. The crm
> status command quickly shows "HA-variator (lsb:variator): Stopped", but it
> takes too much time (for us) before resources are switched to the master
> node (around 1 min). Then the line
>
>         Failed actions: HA-variator_monitor_1000 on slave 'unknown error' (1):
>         call=-1, status=Timed Out, last-rc-change='Sat Dec 21 03:59:45 2013',
>         queued=0ms, exec=0ms
>
> appears in crm status and resources are switched. What is that timeout?
> Where can I change it?

This is the operation timeout. You can change it in the operation definition:

op monitor interval=1s timeout=5s
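
Applied to the primitive from the original post, that would look as follows;
the 5s value is only an example and should exceed the agent's worst-case
monitor run time:

    primitive HA-variator lsb:variator \
            op monitor interval=1s timeout=5s \
            meta migration-threshold=1 failure-timeout=1s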
Re: [Pacemaker] Long failover
We've already tried to set it but it didn't help.

Kind regards,
Dmitriy Matveichev.

-----Original Message-----
From: Andrei Borzenkov [mailto:arvidj...@gmail.com]
Sent: Friday, November 14, 2014 4:12 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Long failover

[...]

This is the operation timeout. You can change it in the operation definition:

op monitor interval=1s timeout=5s
Re: [Pacemaker] Long failover
On Fri, Nov 14, 2014 at 4:33 PM, Dmitry Matveichev d.matveic...@mfisoft.ru wrote:
> We've already tried to set it but it didn't help.

I doubt it is possible to say anything without logs.
Re: [Pacemaker] resource-stickiness not working?
----- Original Message -----
> Here is a simple Active/Passive configuration with a single Dummy resource
> (see end of message). The resource-stickiness default is set to 100. I was
> assuming that this would be enough to keep the Dummy resource on the active
> node as long as the active node stays healthy. However, stickiness is not
> working as I expected in the following scenario:
>
> 1) The node testnode1, which is running the Dummy resource, reboots or
>    crashes
> 2) The Dummy resource fails over to node testnode2
> 3) testnode1 comes back up after the reboot or crash
> 4) The Dummy resource fails back to testnode1
>
> I don't want the resource to fail back to the original node in step 4. That
> is why resource-stickiness is set to 100. The only way I can get the
> resource not to fail back is to set resource-stickiness to INFINITY.
>
> Is this the correct behavior of resource-stickiness? What am I missing?
> This is not what I understand from the documentation on clusterlabs.org.
>
> BTW, after reading various postings on failback issues, I played with
> setting on-fail to standby, but that doesn't seem to help either.
>
> Any help is appreciated!

I agree, this is curious. Can you attach a crm_report? Then we can walk
through the transitions to figure out why this is happening.

-- Vossel

> Scott
>
> node testnode1
> node testnode2
> primitive dummy ocf:heartbeat:Dummy \
>         op start timeout=180s interval=0 \
>         op stop timeout=180s interval=0 \
>         op monitor interval=60s timeout=60s migration-threshold=5
> xml <rsc_location id="cli-prefer-dummy" rsc="dummy" role="Started" node="testnode2" score="INFINITY"/>
> property $id=cib-bootstrap-options \
>         dc-version=1.1.10-14.el6-368c726 \
>         cluster-infrastructure="classic openais (with plugin)" \
>         expected-quorum-votes=2 \
>         stonith-enabled=false \
>         stonith-action=reboot \
>         no-quorum-policy=ignore \
>         last-lrm-refresh=1413378119
> rsc_defaults $id=rsc-options \
>         resource-stickiness=100 \
>         migration-threshold=5
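
One detail that may be worth checking alongside the crm_report: the
configuration above still contains a cli-prefer location constraint (the kind
left behind by "crm resource move"), and an INFINITY location score always
outweighs resource-stickiness=100, so a stale constraint of that sort will
skew placement decisions. A sketch of clearing it with crmsh, in case it
turns out to be stale:

    # Clear a leftover "move" constraint for the resource:
    crm resource unmove dummy

    # or, equivalently, delete the constraint by its id:
    crm configure delete cli-prefer-dummy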
Re: [Pacemaker] Long failover
Please find attached.

Kind regards,
Dmitriy Matveichev.

-----Original Message-----
From: Andrei Borzenkov [mailto:arvidj...@gmail.com]
Sent: Friday, November 14, 2014 4:44 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Long failover

On Fri, Nov 14, 2014 at 4:33 PM, Dmitry Matveichev d.matveic...@mfisoft.ru wrote:
> We've already tried to set it but it didn't help.

I doubt it is possible to say anything without logs.

log.log
Description: log.log
Re: [Pacemaker] Operation attribute change leads to resource restart
----- Original Message -----
> Hi!
>
> Just noticed that deletion of a trace_ra op attribute forces the resource
> to be restarted (that RA does not support reload). The logs show:
>
> Nov 13 09:06:05 [6633] node01 cib: info: cib_process_request: Forwarding cib_apply_diff operation for section 'all' to master (origin=local/cibadmin/2)
> Nov 13 09:06:05 [6633] node01 cib: info: cib_perform_op: Diff: --- 0.641.96 2
> Nov 13 09:06:05 [6633] node01 cib: info: cib_perform_op: Diff: +++ 0.643.0 98ecbda94c7e87250cf2262bf89f43e8
> Nov 13 09:06:05 [6633] node01 cib: info: cib_perform_op: -- /cib/configuration/resources/clone[@id='cl-test-instance']/primitive[@id='test-instance']/operations/op[@id='test-instance-start-0']/instance_attributes[@id='test-instance-start-0-instance_attributes']
> Nov 13 09:06:05 [6633] node01 cib: info: cib_perform_op: + /cib: @epoch=643, @num_updates=0
> Nov 13 09:06:05 [6633] node01 cib: info: cib_process_request: Completed cib_apply_diff operation for section 'all': OK (rc=0, origin=node01/cibadmin/2, version=0.643.0)
> Nov 13 09:06:05 [6638] node01 crmd: info: abort_transition_graph: Transition aborted by deletion of instance_attributes[@id='test-instance-start-0-instance_attributes']: Non-status change (cib=0.643.0, source=te_update_diff:383, path=/cib/configuration/resources/clone[@id='cl-test-instance']/primitive[@id='test-instance']/operations/op[@id='test-instance-start-0']/instance_attributes[@id='test-instance-start-0-instance_attributes'], 1)
> Nov 13 09:06:05 [6638] node01 crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Nov 13 09:06:05 [6634] node01 stonith-ng: info: xml_apply_patchset: v2 digest mis-match: expected 98ecbda94c7e87250cf2262bf89f43e8, calculated 0b344571f3e1bb852e3d10ca23183688
> Nov 13 09:06:05 [6634] node01 stonith-ng: notice: update_cib_cache_cb: [cib_diff_notify] Patch aborted: Application of an update diff failed (-206)
> ...
> Nov 13 09:06:05 [6637] node01 pengine: info: check_action_definition: params:reload <parameters boot_directory="/var/lib/libvirt/boot" config_uri="http://192.168.168.10:8080/cgi-bin/manage_config.cgi?action=%a&resource=%n&instance=%i" start_vm="1" vlan_id_start="2" per_vlan_ip_prefix_len="24" base_img="http://192.168.168.10:8080/pre45-mguard-virt.x86_64.default.qcow2" pool_name="default" outer_phy="eth0" ip_range_prefix="10.101.0.0/16"/>
> Nov 13 09:06:05 [6637] node01 pengine: info: check_action_definition: Parameters to test-instance:0_start_0 on rnode001 changed: was 6f9eb6bd1f87a2b9b542c31cf1b9c57e vs. now 02256597297dbb42aadc55d8d94e8c7f (reload:3.0.9) 0:0;41:3:0:95e66b6a-a190-4e61-83a7-47165fb0105d
> ...
> Nov 13 09:06:05 [6637] node01 pengine: notice: LogActions: Restart test-instance:0 (Started rnode001)
>
> That is not what I'd expect to see.

Any time an instance attribute is changed for a resource, the resource is
restarted/reloaded. This is expected.

-- Vossel

> Is it intended, or just a minor bug?
>
> Best,
> Vladislav
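
For context, trace_ra is normally toggled through crmsh, which adds or
removes the attribute on the operation; per the reply above, that edit counts
as an instance-attribute change and therefore triggers the restart/reload. A
sketch, assuming a crmsh version that provides trace/untrace:

    # Enable RA tracing for the start operation (adds trace_ra to the op):
    crm resource trace test-instance start

    # Disable it again (removes the attribute; this removal is the CIB
    # deletion seen in the logs above):
    crm resource untrace test-instance start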