[Pacemaker] Notes on pacemaker installation on OmniOS
Hello,

I have written up my notes on setting up Pacemaker and Corosync on IllumOS (OmniOS). This is just the basic setup, enough to be able to run the Dummy resource agent. It took me quite some time to get this working, so I want to share what I did in the hope that it helps someone else.

Here's the link: http://blog.zhaw.ch/icclab/use-pacemaker-and-corosync-on-illumos-omnios-to-run-a-ha-activepassive-cluster/

A few things:

* This setup may not be optimal in that resource agents are managed by the hacluster user instead of root. This led to some problems; check this thread: https://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg20834.html

* I took some scripts and the general procedure from Andreas and his page here: http://grueni.github.io/libqb/. Many thanks!

Regards,
Vincenzo.

--
Vincenzo Pii
Researcher, InIT Cloud Computing Lab
Zurich University of Applied Sciences (ZHAW)
blog.zhaw.ch/icclab
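For readers who just want to reproduce the "run the Dummy resource agent" milestone from the blog post, a minimal test configuration looks roughly like the sketch below. It is written in crm shell syntax and assumes crmsh and the ocf:pacemaker:Dummy agent are installed on the nodes; disabling STONITH like this is only acceptable for a throw-away test cluster.

    # Minimal single-resource smoke test (crm shell syntax, sketch only)
    property stonith-enabled=false no-quorum-policy=ignore
    primitive dummy ocf:pacemaker:Dummy \
            op monitor interval=30s timeout=20s

    # Verify the resource is running and where:
    #   crm_mon -1
    #   crm_resource --resource dummy --locate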
Re: [Pacemaker] Notes on pacemaker installation on OmniOS
I added heartbeat and corosync to have both available. Personally I use pacemaker/corosync. With the newest version of pacemaker there is no longer any need to run pacemaker as non-root.

The main problems with pacemaker are the changes of the last months, especially in services_linux.c. As the name implies, this is a problem for non-Linux systems. What is your preferred way to handle e.g. pure Linux kernel functions?

I compiled a version of pacemaker yesterday, but against a revision of pacemaker from August. There are pull requests waiting with patches for Solaris/Illumos. I guess it would be better to add those patches from August and my patches from yesterday to the current master.

Following the patch from Vincenzo, I changed services_os_action_execute in services_linux.c and, for non-Linux systems, added a synchronous wait with ppoll, which is available on Solaris/BSD/MacOS. It should provide the same functionality, as this function uses file descriptors and signal handlers.

Can pull requests be rejected or withdrawn?

Andreas

-----Original Message-----
From: Andrew Beekhof [mailto:and...@beekhof.net]
Sent: Thursday, 13 November 2014 11:13
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Notes on pacemaker installation on OmniOS

> Interesting work... a couple of questions...
>
> - Why heartbeat and corosync?
> - Why the need to run pacemaker as non-root?
>
> Also, I'd really encourage bringing the kinds of patches referenced in these instructions to the attention of upstream so that we can work on getting them merged.
[Pacemaker] Reset failcount for resources
Hi,

I am running a 2-node cluster with this configuration:

     Master/Slave Set: foo-master [foo]
         Masters: [ bharat ]
         Slaves:  [ ram ]
     AC_FLT     (ocf::pw:IPaddr):   Started bharat
     CR_CP_FLT  (ocf::pw:IPaddr):   Started bharat
     CR_UP_FLT  (ocf::pw:IPaddr):   Started bharat
     Mgmt_FLT   (ocf::pw:IPaddr):   Started bharat

where the IPaddr RA is just a modified IPaddr2 RA. Additionally, I have a colocation constraint for the IP addresses to be colocated with the master. I have set migration-threshold=2 for the VIPs, and failure-timeout=15s.

Initially I bring down the interface on bharat to force a switch-over to ram. After this I fail the interfaces on bharat again. Then I bring the interface up again on ram. However, the virtual IPs are now in the Stopped state, and I cannot get out of this unless I use crm_resource -C to reset the state of the resources. Even after that, the failcount of the resources is still set to INFINITY.

Based on the documentation, the failcount on a node should have expired after the failure-timeout, but that doesn't happen. And why isn't the count reset by the crm_resource -C command as well? Is there another command to actually reset the failcount?

Thanks in advance.

Regards,
Arjun
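For completeness, failcounts can also be inspected and cleared explicitly, independently of the cleanup. A sketch follows, using the resource and node names from the status output above; the exact option spellings differ slightly between Pacemaker and crmsh versions, so check the --help output of each tool.

    # Low-level tools (option names may vary by Pacemaker version)
    crm_failcount -G -r AC_FLT -N bharat     # query the failcount of AC_FLT on bharat
    crm_failcount -D -r AC_FLT -N bharat     # delete that failcount
    crm_resource  -C -r AC_FLT -N bharat     # cleanup one resource on one node

    # crm shell equivalents
    crm resource failcount AC_FLT show bharat
    crm resource failcount AC_FLT delete bharat
    crm resource cleanup AC_FLT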
Re: [Pacemaker] Notes on pacemaker installation on OmniOS
On 13 Nov 2014, at 9:50 pm, Grüninger, Andreas (LGL Extern) andreas.gruenin...@lgl.bwl.de wrote:

> I added heartbeat and corosync to have both available. Personally I use pacemaker/corosync. There is no need any more to run pacemaker as non-root with the newest version of pacemaker.

I'm curious... what was the old reason?

> The main problems with pacemaker are the changes of the last months, especially in services_linux.c. As the name implies, this is a problem for non-Linux systems. What is your preferred way to handle e.g. pure Linux kernel functions?

Definitely to isolate them with an appropriate #define (preferably by feature availability rather than OS).

> I compiled a version of pacemaker yesterday, but against a revision of pacemaker from August. There are pull requests waiting with patches for Solaris/Illumos. I guess it would be better to add those patches from August and my patches from yesterday to the current master. Following the patch from Vincenzo, I changed services_os_action_execute in services_linux.c and, for non-Linux systems, added a synchronous wait with ppoll, which is available on Solaris/BSD/MacOS. It should provide the same functionality, as this function uses file descriptors and signal handlers. Can pull requests be rejected or withdrawn?

Is there anything left in them that needs to go in? If so, can you indicate which parts are needed in those pull requests please? The rest we can close - I didn't want to close them in case there was something I had missed.
[Pacemaker] drbd / libvirt / Pacemaker Cluster?
Hello,

I need a cluster with DRBD; the active cluster member should hold a running KVM instance, started via libvirt. A virtual IP is not needed.

It runs, but from time to time it doesn't take over correctly when I reboot the master system. Normally, once the machine is up again, all resources should migrate back to the master system (via the location statements). In most cases this works, but from time to time DRBD fails and the resources stay on the slave server; after rebooting the master server one more time, everything is OK again.

What I still need later is automatic DRBD split-brain recovery - if anyone has a working config for this, it would be interesting to see it.

Here is my pacemaker configuration:

node $id=1084777473 master \
        attributes standby=off maintenance=off
node $id=1084777474 slave \
        attributes maintenance=off standby=off
primitive libvirt upstart:libvirt-bin \
        op start timeout=120s interval=0 \
        op stop timeout=120s interval=0 \
        op monitor interval=30s \
        meta target-role=Started
primitive vmdata ocf:linbit:drbd \
        params drbd_resource=vmdata \
        op monitor interval=29s role=Master \
        op monitor interval=31s role=Slave
primitive vmdata_fs ocf:heartbeat:Filesystem \
        params device=/dev/drbd0 directory=/vmdata fstype=ext4 \
        meta target-role=Started
ms drbd_master_slave vmdata \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
location PrimaryNode-libvirt libvirt 200: master
location PrimaryNode-vmdata_fs vmdata_fs 200: master
location SecondaryNode-libvirt libvirt 10: slave
location SecondaryNode-vmdata_fs vmdata_fs 10: slave
colocation services_colo inf: drbd_master_slave:Master vmdata_fs
order fs_after_drbd inf: drbd_master_slave:promote vmdata_fs:start libvirt:start
property $id=cib-bootstrap-options \
        dc-version=1.1.10-42f2063 \
        cluster-infrastructure=corosync \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        last-lrm-refresh=1415619869

There must be an error in this configuration, but I don't know in which part.
[Pacemaker] TOTEM Retransmit list in logs when a node gets up
Hello,

My cluster seems to work correctly, but when I start corosync and pacemaker on one of the nodes[1] I start to see TOTEM log entries like this:

#+begin_src
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 46 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4a 4b 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
#+end_src

I do not understand what is happening here, do you have any hints?

Regards.

Footnotes:
[1] the VM using two cards
    http://oss.clusterlabs.org/pipermail/pacemaker/2014-November/022962.html

--
Daniel Dehennin
Retrieve my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
Re: [Pacemaker] drbd / libvirt / Pacemaker Cluster?
Hi,

On Thu, Nov 13, 2014 at 01:57:08PM +0100, Heiner Meier wrote:

[...]

> colocation services_colo inf: drbd_master_slave:Master vmdata_fs

This one should be the other way around:

colocation services_colo inf: vmdata_fs drbd_master_slave:Master

> order fs_after_drbd inf: drbd_master_slave:promote vmdata_fs:start libvirt:start

And you need one more colocation:

colocation libvirt-with-fs inf: libvirt vmdata_fs

HTH,

Dejan
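Putting Dejan's corrections together with the original order constraint, the constraint section of Heiner's configuration would then look roughly like this (a sketch using the resource names from the original config):

    # filesystem follows the DRBD master, libvirt follows the filesystem
    colocation services_colo inf: vmdata_fs drbd_master_slave:Master
    colocation libvirt-with-fs inf: libvirt vmdata_fs
    order fs_after_drbd inf: drbd_master_slave:promote vmdata_fs:start libvirt:start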
Re: [Pacemaker] drbd / libvirt / Pacemaker Cluster?
And you need to configure cluster fencing, and make sure DRBD is configured to use the pacemaker fencing:

http://www.drbd.org/users-guide/s-pacemaker-fencing.html

2014-11-13 14:58 GMT+01:00 Dejan Muhamedagic deja...@fastmail.fm:

> This one should be the other way around:
>
> colocation services_colo inf: vmdata_fs drbd_master_slave:Master
>
> And you need one more colocation:
>
> colocation libvirt-with-fs inf: libvirt vmdata_fs
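On the Pacemaker side, fencing could look something like the sketch below (crm shell syntax). The external/ipmi device and all of its parameters are only examples and must be replaced with whatever fencing hardware is actually available; node names master/slave are taken from Heiner's config. On the DRBD side, the linked guide describes setting "fencing resource-and-stonith;" and wiring the crm-fence-peer.sh / crm-unfence-peer.sh handlers, which also covers the split-brain handling asked about earlier in the thread.

    # One fencing device per node (IPMI parameters are placeholders)
    primitive fence-master stonith:external/ipmi \
            params hostname=master ipaddr=192.168.1.10 userid=admin passwd=secret interface=lan \
            op monitor interval=60s
    primitive fence-slave stonith:external/ipmi \
            params hostname=slave ipaddr=192.168.1.11 userid=admin passwd=secret interface=lan \
            op monitor interval=60s

    # A node should not run its own fencing device
    location fence-master-not-on-master fence-master -inf: master
    location fence-slave-not-on-slave fence-slave -inf: slave

    # Re-enable STONITH (it is disabled in the original configuration)
    property stonith-enabled=true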
Re: [Pacemaker] resource-discovery question
----- Original Message -----
> 12.11.2014 22:57, David Vossel wrote:
> > ----- Original Message -----
> > > 12.11.2014 22:04, Vladislav Bogdanov wrote:
> > >
> > > Hi David, all,
> > >
> > > I'm trying to get resource-discovery=never working with cd7c9ab, but I still get "Not installed" probe failures from nodes which do not have the corresponding resource agents installed. The only difference in my location constraints compared to what is committed in #589 is that they are rule-based (to match #kind). Is that supposed to work with the current master, or is it still TBD?
>
> Yep, after I modified the constraint to a rule-less syntax, it works:

ahh, good catch. I'll take a look!

> <rsc_location id="vlan003-on-cluster-nodes" rsc="vlan003" score="-INFINITY" node="rnode001" resource-discovery="never"/>
>
> But I'd prefer that killer feature to work with rules too :)
>
> Although resource-discovery=exclusive with score 0 for multiple nodes should probably also work for me, correct?

yep it should.

> I cannot test that on a cluster with one cluster node and one remote node.

this feature should work the same with remote nodes and cluster nodes.

I'll get a patch out for the rule issue. I'm also pushing out some documentation for the resource-discovery option. It seems like you've got a good handle on it already though :)

> Oh, I see the new pull request, thank you very much!
>
> One side question: is the default value of clone-max influenced by the resource-discovery value(s)?

kind of. with 'exclusive', if the number of nodes in the exclusive set is smaller than clone-max, clone-max is effectively reduced to the node count in the exclusive set. 'never' and 'always' do not directly influence resource placement, only 'exclusive' does.

> > > My location constraints look like:
> > >
> > > <rsc_location id="vlan003-on-cluster-nodes" rsc="vlan003" resource-discovery="never">
> > >   <rule score="-INFINITY" id="vlan003-on-cluster-nodes-rule">
> > >     <expression attribute="#kind" operation="ne" value="cluster" id="vlan003-on-cluster-nodes-rule-expression"/>
> > >   </rule>
> > > </rsc_location>
> > >
> > > Do I miss something?
> > >
> > > Best,
> > > Vladislav
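For readers following along, the rule-less "exclusive" variant David confirms above would look roughly like this in the CIB (a sketch; vlan003 is the resource from the thread, while node1/node2 are placeholder node names). With resource-discovery="exclusive", probing and placement are restricted to the listed nodes, and a score of 0 expresses no preference among them:

    <rsc_location id="vlan003-exclusive-node1" rsc="vlan003" score="0"
                  node="node1" resource-discovery="exclusive"/>
    <rsc_location id="vlan003-exclusive-node2" rsc="vlan003" score="0"
                  node="node2" resource-discovery="exclusive"/>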
[Pacemaker] resource-stickiness not working?
Here is a simple Active/Passive configuration with a single Dummy resource (see the end of this message). The resource-stickiness default is set to 100. I was assuming that this would be enough to keep the Dummy resource on the active node as long as the active node stays healthy. However, stickiness is not working as I expected in the following scenario:

1) The node testnode1, which is running the Dummy resource, reboots or crashes
2) The Dummy resource fails over to node testnode2
3) testnode1 comes back up after the reboot or crash
4) The Dummy resource fails back to testnode1

I don't want the resource to fail back to the original node in step 4; that is why resource-stickiness is set to 100. The only way I can get the resource not to fail back is to set resource-stickiness to INFINITY.

Is this the correct behavior of resource-stickiness? What am I missing? This is not what I understand from the documentation on clusterlabs.org. BTW, after reading various postings on failback issues, I played with setting on-fail to standby, but that doesn't seem to help either.

Any help is appreciated!

Scott

node testnode1
node testnode2
primitive dummy ocf:heartbeat:Dummy \
        op start timeout=180s interval=0 \
        op stop timeout=180s interval=0 \
        op monitor interval=60s timeout=60s migration-threshold=5
xml <rsc_location id="cli-prefer-dummy" rsc="dummy" role="Started" node="testnode2" score="INFINITY"/>
property $id=cib-bootstrap-options \
        dc-version=1.1.10-14.el6-368c726 \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        stonith-action=reboot \
        no-quorum-policy=ignore \
        last-lrm-refresh=1413378119
rsc_defaults $id=rsc-options \
        resource-stickiness=100 \
        migration-threshold=5
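One detail worth noting in the configuration above: it still contains a cli-prefer-dummy location constraint with score=INFINITY. Constraints named cli-prefer-* are the ones left behind by crm_resource --move / crm resource migrate, and a leftover INFINITY location constraint will always outweigh a finite stickiness of 100, so it is worth clearing it before testing failback behaviour. A sketch (option names vary between releases: -U/--un-move in Pacemaker 1.1.x, --clear in later versions; older crmsh spells the command "unmigrate"):

    # Remove any cli-prefer-* constraint for the resource
    crm_resource -r dummy -U          # 1.1.x: --un-move; newer releases: --clear

    # crm shell equivalent
    crm resource unmove dummy         # or "unmigrate" in older crmsh versions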
Re: [Pacemaker] Notes on pacemaker installation on OmniOS
I am really sorry, but I forgot the reason. It is now two years ago that I had problems with starting pacemaker as root. If I remember correctly, pacemaker always got "access denied" when connecting to corosync. With a non-root account it worked flawlessly.

The pull request from branch upstream3 can be closed. There is a new pull request from branch upstream4 with the changes against the current master.

-----Original Message-----
From: Andrew Beekhof [mailto:and...@beekhof.net]
Sent: Thursday, 13 November 2014 12:11
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Notes on pacemaker installation on OmniOS

> I'm curious... what was the old reason?
>
> Is there anything left in them that needs to go in? If so, can you indicate which parts are needed in those pull requests please? The rest we can close - I didn't want to close them in case there was something I had missed.
Re: [Pacemaker] Notes on pacemaker installation on OmniOS
On 14 Nov 2014, at 6:54 am, Grüninger, Andreas (LGL Extern) andreas.gruenin...@lgl.bwl.de wrote:

> I am really sorry, but I forgot the reason. It is now two years ago that I had problems with starting pacemaker as root. If I remember correctly, pacemaker always got "access denied" when connecting to corosync. With a non-root account it worked flawlessly.

Oh. That would be this patch: https://github.com/beekhof/pacemaker/commit/3c9275e9

I always thought there was a philosophical objection.

> The pull request from branch upstream3 can be closed. There is a new pull request from branch upstream4 with the changes against the current master.

Excellent