[Pacemaker] Notes on pacemaker installation on OmniOS

2014-11-13 Thread Vincenzo Pii
Hello,

I have written down my notes on the setup of pacemaker and corosync on
IllumOS (OmniOS).

This is just the basic setup, enough to run the Dummy resource agent. It
took me quite some time to get this done, so I want to share what I did in
case it helps someone else.

Here's the link:
http://blog.zhaw.ch/icclab/use-pacemaker-and-corosync-on-illumos-omnios-to-run-a-ha-activepassive-cluster/

A few things:

 * This setup may not be optimal in how resource agents are managed: they
run as the hacluster user instead of root. This led to some problems; see
this thread:
https://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg20834.html
 * I took some scripts and the general procedure from Andreas and his page
here: http://grueni.github.io/libqb/. Many thanks!

Regards,
Vincenzo.

-- 
Vincenzo Pii
Researcher, InIT Cloud Computing Lab
Zurich University of Applied Sciences (ZHAW)
blog.zhaw.ch/icclab
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Notes on pacemaker installation on OmniOS

2014-11-13 Thread LGL Extern
I added heartbeat and corosync to have both available.
Personally I use pacemaker/corosync.

With the newest version of pacemaker there is no longer any need to run it
as non-root.

The main problems with pacemaker are the changes of the last few months,
especially in services_linux.c.
As the name implies, this is a problem for non-Linux systems.
What is your preferred way to handle, e.g., Linux-only kernel functions?

Yesterday I compiled a version of pacemaker, but from a revision dating
back to August.
There are pull requests waiting with patches for Solaris/Illumos.
I think it would be better to apply those patches from August, plus my
patches from yesterday, to the current master.
Following Vincenzo's patch, I changed services_os_action_execute in
services_linux.c and, for non-Linux systems, added a synchronous wait using
ppoll, which is available on Solaris/BSD/MacOS. It should provide the same
functionality, as this function uses file descriptors and signal handlers.
Can pull requests be rejected or withdrawn?

Andreas


-----Original Message-----
From: Andrew Beekhof [mailto:and...@beekhof.net]
Sent: Thursday, 13 November 2014 11:13
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Notes on pacemaker installation on OmniOS

Interesting work... a couple of questions...

- Why heartbeat and corosync?
- Why the need to run pacemaker as non-root?

Also, for the kinds of patches referenced in these instructions, I really
encourage bringing them to the attention of upstream so that we can work on
getting them merged.



[Pacemaker] Reset failcount for resources

2014-11-13 Thread Arjun Pandey
Hi

I am running a two-node cluster with this config:

 Master/Slave Set: foo-master [foo]
     Masters: [ bharat ]
     Slaves: [ ram ]
 AC_FLT (ocf::pw:IPaddr): Started bharat
 CR_CP_FLT (ocf::pw:IPaddr): Started bharat
 CR_UP_FLT (ocf::pw:IPaddr): Started bharat
 Mgmt_FLT (ocf::pw:IPaddr): Started bharat

where the IPaddr RA is just a modified IPaddr2 RA. Additionally, I have
colocation constraints for the IP addresses to be colocated with the
master. I have set migration-threshold to 2 for the VIPs, and
failure-timeout to 15s.
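
For reference, those meta attributes would typically be set like this in
crm shell syntax (a sketch; only the attribute names and values above are
taken from my setup):

    # per-resource meta attributes on one of the VIPs
    crm resource meta AC_FLT set migration-threshold 2
    crm resource meta AC_FLT set failure-timeout 15s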


Initially I bring down the interface on bharat to force a switch-over to
ram. After this I fail the interfaces on bharat again. Now I bring the
interface up again on ram. However, the virtual IPs are now in the stopped
state.

I don't get out of this unless I use crm_resource -C to reset the state of
the resources.
However, if I check the failcount of the resources after this, it is still
set to INFINITY.
Based on the documentation, the failcount on a node should have expired
after the failure-timeout; that doesn't happen. Also, why doesn't the
crm_resource -C command reset the count too? Is there any other command to
actually reset the failcount?
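
Two things worth checking here (hedged, since the full config isn't shown):
failure-timeout is only re-evaluated when the policy engine runs, which by
default happens once per cluster-recheck-interval (15 minutes), so a 15s
timeout will not visibly expire until the next recheck unless another event
triggers a transition. Failcounts can also be inspected and cleared
directly (option spellings may vary slightly between pacemaker versions):

    # show the failcount for one resource on one node
    crm_failcount -r AC_FLT -N bharat -G
    # clear it
    crm_failcount -r AC_FLT -N bharat -D
    # make failure-timeout expiry take effect sooner (illustrative value)
    crm configure property cluster-recheck-interval=2min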

Thanks in advance

Regards
Arjun


Re: [Pacemaker] Notes on pacemaker installation on OmniOS

2014-11-13 Thread Andrew Beekhof

 On 13 Nov 2014, at 9:50 pm, Grüninger, Andreas (LGL Extern) 
 andreas.gruenin...@lgl.bwl.de wrote:
 
 I added heartbeat and corosync to have both available.
 Personally I use pacemaker/corosync.
 
 With the newest version of pacemaker there is no longer any need to run it
 as non-root.

I'm curious... what was the old reason?

 
 The main problems with pacemaker are the changes of the last few months,
 especially in services_linux.c.
 As the name implies, this is a problem for non-Linux systems.
 What is your preferred way to handle, e.g., Linux-only kernel functions?

Definitely to isolate them with an appropriate #define (preferably by feature 
availability rather than OS)
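
In autotools terms that would be a configure-time function check rather
than an OS switch; a sketch, assuming the usual autoconf macros:

    # configure.ac: test for the functions, not for the platform
    AC_CHECK_FUNCS([ppoll signalfd])
    # code can then branch on #ifdef HAVE_PPOLL / #ifdef HAVE_SIGNALFD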

 
 Yesterday I compiled a version of pacemaker, but from a revision dating
 back to August.
 There are pull requests waiting with patches for Solaris/Illumos.
 I think it would be better to apply those patches from August, plus my
 patches from yesterday, to the current master.
 Following Vincenzo's patch, I changed services_os_action_execute in
 services_linux.c and, for non-Linux systems, added a synchronous wait with
 ppoll, which is available on Solaris/BSD/MacOS. It should provide the same
 functionality, as this function uses file descriptors and signal handlers.
 Can pull requests be rejected or withdrawn?

Is there anything left in them that needs to go in?
If so, can you indicate which parts are needed in those pull requests please?
The rest we can close - I didn't want to close them in case there was something 
I had missed.

 
 Andreas
 
 


[Pacemaker] drbd / libvirt / Pacemaker Cluster?

2014-11-13 Thread Heiner Meier
Hello,

I need a cluster with DRBD; the active cluster member should hold a
running KVM instance, started via libvirt.

A virtual IP is not needed.

It runs, but from time to time it doesn't take over correctly when I
reboot the master system; normally, after the machine is up again, all
resources should migrate back to the master system (via a location
statement).

In most cases this works, but from time to time DRBD fails and the
resources stay on the slave server; after rebooting the master server one
more time, everything is OK.

What I still need later is automatic DRBD split-brain recovery; if anyone
has a working config for this, it would be interesting to see it.

Here is my pacemaker configuration:

node $id=1084777473 master \
attributes standby=off maintenance=off
node $id=1084777474 slave \
attributes maintenance=off standby=off
primitive libvirt upstart:libvirt-bin \
op start timeout=120s interval=0 \
op stop timeout=120s interval=0 \
op monitor interval=30s \
meta target-role=Started
primitive vmdata ocf:linbit:drbd \
params drbd_resource=vmdata \
op monitor interval=29s role=Master \
op monitor interval=31s role=Slave
primitive vmdata_fs ocf:heartbeat:Filesystem \
params device=/dev/drbd0 directory=/vmdata fstype=ext4 \
meta target-role=Started
ms drbd_master_slave vmdata \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
location PrimaryNode-libvirt libvirt 200: master
location PrimaryNode-vmdata_fs vmdata_fs 200: master
location SecondaryNode-libvirt libvirt 10: slave
location SecondaryNode-vmdata_fs vmdata_fs 10: slave
colocation services_colo inf: drbd_master_slave:Master vmdata_fs
order fs_after_drbd inf: drbd_master_slave:promote vmdata_fs:start libvirt:start
property $id=cib-bootstrap-options \
dc-version=1.1.10-42f2063 \
cluster-infrastructure=corosync \
stonith-enabled=false \
no-quorum-policy=ignore \
last-lrm-refresh=1415619869


There must be an error in this configuration, but I don't know in which
part.



[Pacemaker] TOTEM Retransmit list in logs when a node gets up

2014-11-13 Thread Daniel Dehennin
Hello,

My cluster seems to work correctly, but when I start corosync and
pacemaker on one of the nodes[1] I start to see TOTEM logs like this:

#+begin_src
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 46 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4a 4b 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
#+end_src

I do not understand what happens, do you have any hints?
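
For context (hedged, not a diagnosis): retransmit lists generally mean a
node missed totem traffic, which often comes down to packet loss, an MTU
mismatch, or a NIC that drops multicast frames while interfaces come up.
Things commonly suggested for this are checking the rings with
corosync-cfgtool and, if an MTU problem is suspected, lowering netmtu in
the totem section of corosync.conf (the value below is illustrative only):

    # show ring status on each node
    corosync-cfgtool -s

    # corosync.conf (sketch)
    totem {
            netmtu: 1400
    }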

Regards.

Footnotes: 
[1]  the VM using two cards 
http://oss.clusterlabs.org/pipermail/pacemaker/2014-November/022962.html

-- 
Daniel Dehennin
Fetch my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF


signature.asc
Description: PGP signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] drbd / libvirt / Pacemaker Cluster?

2014-11-13 Thread Dejan Muhamedagic
Hi,

On Thu, Nov 13, 2014 at 01:57:08PM +0100, Heiner Meier wrote:
 [...]
 
 Here is my pacemaker configuration:
 
 node $id=1084777473 master \
 attributes standby=off maintenance=off
 node $id=1084777474 slave \
 attributes maintenance=off standby=off
 primitive libvirt upstart:libvirt-bin \
 op start timeout=120s interval=0 \
 op stop timeout=120s interval=0 \
 op monitor interval=30s \
 meta target-role=Started
 primitive vmdata ocf:linbit:drbd \
 params drbd_resource=vmdata \
 op monitor interval=29s role=Master \
 op monitor interval=31s role=Slave
 primitive vmdata_fs ocf:heartbeat:Filesystem \
 params device=/dev/drbd0 directory=/vmdata fstype=ext4 \
 meta target-role=Started
 ms drbd_master_slave vmdata \
 meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
 location PrimaryNode-libvirt libvirt 200: master
 location PrimaryNode-vmdata_fs vmdata_fs 200: master
 location SecondaryNode-libvirt libvirt 10: slave
 location SecondaryNode-vmdata_fs vmdata_fs 10: slave
 colocation services_colo inf: drbd_master_slave:Master vmdata_fs

This one should be the other way around:

colocation services_colo inf: vmdata_fs drbd_master_slave:Master

 order fs_after_drbd inf: drbd_master_slave:promote vmdata_fs:start libvirt:start

And you need one more colocation:

colocation libvirt-with-fs inf: libvirt vmdata_fs
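
An equivalent way to express the filesystem/libvirt coupling, if you prefer
fewer explicit constraints, would be a group (a sketch, untested; group
members are implicitly colocated and started in order):

    group vm_services vmdata_fs libvirt
    colocation vm_services_on_drbd inf: vm_services drbd_master_slave:Master
    order vm_services_after_drbd inf: drbd_master_slave:promote vm_services:start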

HTH,

Dejan

 property $id=cib-bootstrap-options \
 dc-version=1.1.10-42f2063 \
 cluster-infrastructure=corosync \
 stonith-enabled=false \
 no-quorum-policy=ignore \
 last-lrm-refresh=1415619869
 
 
 There must be an error in this configuration, but I don't know in which
 part.
 



Re: [Pacemaker] drbd / libvirt / Pacemaker Cluster?

2014-11-13 Thread emmanuel segura
You also need to configure fencing in your cluster, and you need to make
sure DRBD is configured to use Pacemaker fencing:
http://www.drbd.org/users-guide/s-pacemaker-fencing.html
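
The linked guide boils down to something like the following in the DRBD
resource definition (a sketch based on the users guide; handler paths may
differ by distribution):

    resource vmdata {
      disk {
        fencing resource-only;
      }
      handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
      }
      ...
    }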

2014-11-13 14:58 GMT+01:00 Dejan Muhamedagic deja...@fastmail.fm:
 [...]



-- 
this is my life and I live it as long as God wills



Re: [Pacemaker] resource-discovery question

2014-11-13 Thread David Vossel


- Original Message -
 12.11.2014 22:57, David Vossel wrote:
  
  
  - Original Message -
  12.11.2014 22:04, Vladislav Bogdanov wrote:
  Hi David, all,
 
  I'm trying to get resource-discovery=never working with cd7c9ab, but
  still get "Not installed" probe failures from nodes which do not have the
  corresponding resource agents installed.
 
  The only difference in my location constraints, compared to what is
  committed in #589, is that they are rule-based (to match #kind). Is that
  supposed to work with the current master, or is it still TBD?
 
  Yep, after I modified the constraint to the rule-less syntax, it works:
  
  ahh, good catch. I'll take a look!
  
 
  <rsc_location id="vlan003-on-cluster-nodes" rsc="vlan003" score="-INFINITY" node="rnode001" resource-discovery="never"/>
 
  But I'd prefer that killer feature to work with rules too :)
  Although resource-discovery=exclusive with score 0 for multiple nodes
  should probably also work for me, correct?
  
  yep it should.
  
  I cannot test that on a cluster with one cluster
  node and one
  remote node.
  
  this feature should work the same with remote nodes and cluster nodes.
  
  I'll get a patch out for the rule issue. I'm also pushing out some
  documentation
  for the resource-discovery option. It seems like you've got a good handle
  on it
  already though :)
 
 Oh, I see the new pull request, thank you very much!
 
 One side question: is the default value for clone-max influenced by the
 resource-discovery value(s)?

Kind of.

With 'exclusive', if the number of nodes in the exclusive set is smaller
than clone-max, clone-max is effectively reduced to the node count in the
exclusive set.

'never' and 'always' do not directly influence resource placement; only
'exclusive' does.
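
For example, limiting both placement and discovery of vlan003 to an
explicit two-node set could look like this (hypothetical node names):

    <rsc_location id="vlan003-only-n1" rsc="vlan003" score="0" node="node1" resource-discovery="exclusive"/>
    <rsc_location id="vlan003-only-n2" rsc="vlan003" score="0" node="node2" resource-discovery="exclusive"/>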


 
 
  
 
  My location constraints look like:
 
    <rsc_location id="vlan003-on-cluster-nodes" rsc="vlan003" resource-discovery="never">
      <rule score="-INFINITY" id="vlan003-on-cluster-nodes-rule">
        <expression attribute="#kind" operation="ne" value="cluster" id="vlan003-on-cluster-nodes-rule-expression"/>
      </rule>
    </rsc_location>
 
  Am I missing something?
 
  Best,
  Vladislav
 



[Pacemaker] resource-stickiness not working?

2014-11-13 Thread Scott Donoho
Here is a simple active/passive configuration with a single Dummy resource
(see the end of this message). The resource-stickiness default is set to
100. I assumed that this would be enough to keep the Dummy resource on the
active node as long as that node stays healthy. However, stickiness is not
working as I expected in the following scenario:

1) The node testnode1, which is running the Dummy resource, reboots or crashes
2) The Dummy resource fails over to node testnode2
3) testnode1 comes back up after the reboot or crash
4) The Dummy resource fails back to testnode1

I don't want the resource to fail back to the original node in step 4; that
is why resource-stickiness is set to 100. The only way I can get the
resource not to fail back is to set resource-stickiness to INFINITY. Is
this the correct behavior of resource-stickiness? What am I missing? This
is not what I understand from the documentation on clusterlabs.org. BTW,
after reading various postings on failback issues, I played with setting
on-fail to standby, but that doesn't seem to help either. Any help is
appreciated!

   Scott

node testnode1
node testnode2
primitive dummy ocf:heartbeat:Dummy \
op start timeout=180s interval=0 \
op stop timeout=180s interval=0 \
op monitor interval=60s timeout=60s migration-threshold=5
xml <rsc_location id="cli-prefer-dummy" rsc="dummy" role="Started" node="testnode2" score="INFINITY"/>
property $id=cib-bootstrap-options \
dc-version=1.1.10-14.el6-368c726 \
cluster-infrastructure=classic openais (with plugin) \
expected-quorum-votes=2 \
stonith-enabled=false \
stonith-action=reboot \
no-quorum-policy=ignore \
last-lrm-refresh=1413378119
rsc_defaults $id=rsc-options \
resource-stickiness=100 \
migration-threshold=5
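
One thing that stands out in the config above (hedged, since it may not
explain the exact failback described): the cli-prefer-dummy constraint pins
dummy with score INFINITY. Constraints named cli-prefer-* are usually left
over from a crm resource move/migrate, and an INFINITY location score
always outweighs a finite stickiness of 100, so it is worth clearing before
testing stickiness:

    # remove constraints created by 'crm resource move' (crmsh)
    crm resource unmove dummy
    # or with the low-level tool (option name varies by version: -U/--clear)
    crm_resource --resource dummy -U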






Re: [Pacemaker] Notes on pacemaker installation on OmniOS

2014-11-13 Thread LGL Extern
I am really sorry, but I have forgotten the reason. It is now two years ago
that I had problems with starting pacemaker as root.
If I remember correctly, pacemaker always got "access denied" when
connecting to corosync.
With a non-root account it worked flawlessly.

The pull request from branch upstream3 can be closed.
There is a new pull request from branch upstream4 with the changes against the 
current master.


-----Original Message-----
From: Andrew Beekhof [mailto:and...@beekhof.net]
Sent: Thursday, 13 November 2014 12:11
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Notes on pacemaker installation on OmniOS


 On 13 Nov 2014, at 9:50 pm, Grüninger, Andreas (LGL Extern) 
 andreas.gruenin...@lgl.bwl.de wrote:
 [...]

I'm curious... what was the old reason?

 [...]

Is there anything left in them that needs to go in?
If so, can you indicate which parts are needed in those pull requests please?
The rest we can close - I didn't want to close them in case there was something
I had missed.
Re: [Pacemaker] Notes on pacemaker installation on OmniOS

2014-11-13 Thread Andrew Beekhof

 On 14 Nov 2014, at 6:54 am, Grüninger, Andreas (LGL Extern) 
 andreas.gruenin...@lgl.bwl.de wrote:
 
 I am really sorry, but I have forgotten the reason. It is now two years
 ago that I had problems with starting pacemaker as root.
 If I remember correctly, pacemaker always got "access denied" when
 connecting to corosync.
 With a non-root account it worked flawlessly.


Oh. That would be this patch:
https://github.com/beekhof/pacemaker/commit/3c9275e9
I always thought there was a philosophical objection.


 
 The pull request from branch upstream3 can be closed.
 There is a new pull request from branch upstream4 with the changes against 
 the current master.

Excellent

 
 