Re: [ClusterLabs] Question about fence_mpath

2017-04-27 Thread Chris Adams
Once upon a time, Seth Reid  said:
> This is part of my multipath.conf that shows the key.
> $ cat /etc/multipath.conf
> defaults {
> user_friendly_names yes
> find_multipaths yes
> reservation_key 33c5
> }

Ah, I should have checked the multipath.conf man page (was just looking
at the fence_mpath page).  Thanks!

-- 
Chris Adams 

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] resource group vs colocation

2017-04-27 Thread Ken Gaillot
On 04/27/2017 02:02 PM, lejeczek wrote:
> hi everyone
> 
> I have a group and I'm trying to add colocation on top of it (sounds
> strange, I know) because the order within the group is not how I want it.
> I was hoping that with a colocation set I could reorder the resources.
> Can I? Because something is off, or my configuration is not getting there.
> I have within a group:
> 
> IP
> mount
> smb
> IP1
> 
> and I colocated sets:
> 
> set IP IP1 sequential=false set mount smb
> 
> and yet smb would not start on IP1. I see the resources are still being
> ordered as they are listed.
> 
> Could somebody shed more light on what is wrong and group vs colocation
> subject?
> 
> m. thanks
> L.

A group is a shorthand for colocation and order constraints between its
members. So, you should use either a group, or a colocation set, but not
both with the same members.

If you simply want to reorder the sequence in which the group members
start, just recreate the group, listing them in the order you want. That
is, the first member of the group will be started first, then the second
member, etc.

If you prefer using sets, then don't group the resources -- use separate
colocation and ordering constraints with the sets, as desired.
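
To make the two options above concrete, here is a minimal dry-run sketch. It only prints the pcs commands rather than running them. The group name "smbgroup" is hypothetical, the start order IP, IP1, mount, smb is an assumption about what the poster wants, and option 2 uses plain pairwise constraints rather than sets to keep the syntax simple.

```shell
#!/bin/sh
# Dry-run sketch: each pcs command is printed, not executed.
# "smbgroup" is a hypothetical group name; IP, IP1, mount and smb are the
# resource IDs from the question. The start order shown is an assumption.
run() { echo "pcs $*"; }    # on a real cluster: run() { pcs "$@"; }

GROUP=smbgroup

# Option 1: keep the group, but recreate it in the desired start order.
regroup() {
    run resource ungroup "$GROUP"
    run resource group add "$GROUP" IP IP1 mount smb
}

# Option 2: drop the group and spell out what it implied, pair by pair.
explicit_constraints() {
    run resource ungroup "$GROUP"
    prev=""
    for rsc in IP IP1 mount smb; do
        if [ -n "$prev" ]; then
            run constraint order start "$prev" then start "$rsc"
            run constraint colocation add "$rsc" with "$prev"
        fi
        prev=$rsc
    done
}

regroup
```

Only one of the two functions should be used for a given set of resources, since mixing a group with extra constraints on the same members is what caused the confusion in the first place.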



Re: [ClusterLabs] Question about fence_mpath

2017-04-27 Thread Seth Reid
> The man page talks about it
> being unique to each node (but you only create the STONITH object from one
> node, right?).

The key is unique to each node, but there is only one stonith object. For
fence_mpath, they are putting SCSI keys on a shared stonith device.

> It also says it has to be set in /etc/multipath.conf
> but then doesn't say how/where.

This is part of my multipath.conf that shows the key.
$ cat /etc/multipath.conf
defaults {
user_friendly_names yes
find_multipaths yes
reservation_key 33c5
}

> I am trying to set up a new cluster using fence_mpath.  I'm not sure
> what to use for the "key" value though.

I used the fence_scsi program directly to figure out the key when I
initially set up my clusters. With what I know now, I could make up a key.
It's an 8-character hex key. When you see it with mpathpersist, it will look
like 0x33c5 (in the example above). It usually looks like that in the
logs, but don't put it like that in the config.
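
As a small illustration of the point about the two spellings of the key, the sketch below reads reservation_key from a multipath.conf-style file and prints it with the 0x prefix that, according to the message above, shows up in the tool output and logs. The helper names are made up for this example, and the device path in the comment is a placeholder.

```shell
#!/bin/sh
# Sketch: read reservation_key from a multipath.conf-style file and print
# it in the 0x-prefixed form described in the message above.
conf_key() {
    # $1 = path to the config file; prints the first reservation_key value
    awk '$1 == "reservation_key" { print $2; exit }' "$1"
}

displayed_key() {
    key=$(conf_key "$1")
    if [ -n "$key" ]; then
        printf '0x%s\n' "$key"
    fi
}

# On a cluster node one might compare this against the keys registered on
# the shared device (the device path is a placeholder):
#   mpathpersist -i -k -d /dev/mapper/mpatha
if [ -r /etc/multipath.conf ]; then
    displayed_key /etc/multipath.conf
fi
```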


[ClusterLabs] Question about fence_mpath

2017-04-27 Thread Chris Adams
I am trying to set up a new cluster using fence_mpath.  I'm not sure
what to use for the "key" value though.  The man page talks about it
being unique to each node (but you only create the STONITH object from one
node, right?).  It also says it has to be set in /etc/multipath.conf
but then doesn't say how/where.

-- 
Chris Adams 



[ClusterLabs] resource group vs colocation

2017-04-27 Thread lejeczek

hi everyone

I have a group and I'm trying to add colocation on top of it
(sounds strange, I know) because the order within the group
is not how I want it.
I was hoping that with a colocation set I could reorder the
resources. Can I? Because something is off, or my
configuration is not getting there.

I have within a group:

IP
mount
smb
IP1

and I colocated sets:

set IP IP1 sequential=false set mount smb

and yet smb would not start on IP1. I see the resources are
still being ordered as they are listed.


Could somebody shed more light on what is wrong and group vs 
colocation subject?


m. thanks
L.




Re: [ClusterLabs] Antw: in standby but still running resources..

2017-04-27 Thread lejeczek



On 27/04/17 14:35, Ulrich Windl wrote:
> Did you try a reprobe, watching messages?
>
> Regards,
> Ulrich
>
> lejeczek wrote on 27.04.2017 at 15:29 in message
> <0e1570ac-995a-371c-a05d-2c16d42cf...@yahoo.co.uk>:
>> .. is this ok?
>>
>> hi guys,
>>
>> pcs shows no errors after I put a node in standby, but pcs shows
>> resources still running on the node I just put in standby.
>> Is this normal?
>>
>> 0.9.152 @C7.3
>> thanks
>> P.

I think it happens when I use order constraints. I do
something like:

start A then Z
start B then Z
start C then Z

All the resources are in one group, with no colocation or location
constraints. Then I put the node in standby; pcs shows it as standby,
but the resources are still on the node.

Should be easy to replicate in case it's a bug.








Re: [ClusterLabs] in standby but still running resources..

2017-04-27 Thread Ken Gaillot
On 04/27/2017 08:29 AM, lejeczek wrote:
> .. is this ok?
> 
> hi guys,
> 
> pcs shows no errors after I put a node in standby, but pcs shows resources
> still running on the node I just put in standby.
> Is this normal?
> 
> 0.9.152 @C7.3
> thanks
> P.

That should happen only for as long as it takes to stop the resources
there. If it's an ongoing condition, something is wrong.



Re: [ClusterLabs] Corosync CPU load slowly increasing if one node present

2017-04-27 Thread Jan Friesse

Stefan,

> Hello everyone!
>
> I am using Pacemaker (1.1.12), Corosync (2.3.0) and libqb (0.16.0) in 2-node
> clusters (virtualized in VMware infrastructure, OS: RHEL 6.7).
> I noticed that if only one node is present, the CPU usage of Corosync (as seen
> with top) is slowly but steadily increasing (over days; in my setting about 1%
> per day). The node is basically idle, some Pacemaker managed resources are
> running but they are not contacted by any clients.
> I upgraded a test stand-alone node to Corosync (2.4.2) and libqb (1.0.1) (which
> at least made the memleak go away), but the CPU usage is still increasing on
> the node.
> When I add a second node to the cluster, the CPU load drops back down to a
> normal (low) CPU usage.
> I haven't witnessed the increasing CPU load yet if two nodes were present in a
> cluster.
>
> Even if running Pacemaker/Corosync as a massive-overkill-Monit-replacement is
> questionable, the observed CPU-load is not what I expect to happen.
>
> What could be the reason for this CPU-load increase? Is there a rationale
> behind this?

This is a really interesting observation. I can only speak for corosync,
and I must say no, there is no rationale behind it. It simply shouldn't be
happening. Also, I don't see any reason why connecting other node(s)
would help reduce the CPU load.

> Is this a config thing or something in the binaries?

For sure not in corosync. Also, your config file looks fine.

Could you test with a single ring only, and with udpu, to see whether the
behavior stays the same?

Regards,
  Honza
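
One way to try the single-ring, udpu test suggested above is sketched below. It prints a reduced test configuration based on the posted totem section: the second interface, rrp_mode and mcastaddr are dropped, transport is set to udpu, and a nodelist is added. The function name and the default ring0 address are assumptions made for this example, and only the sections relevant to the test are shown.

```shell
#!/bin/sh
# Sketch: print a single-ring, udpu test variant of the posted totem setup.
# $1 is this node's ring0 address; the default below is a made-up example.
gen_udpu_conf() {
    addr=${1:-10.20.30.1}
    cat <<EOF
totem {
    version: 2
    secauth: on
    token: 1000
    transport: udpu
    interface {
        ringnumber: 0
        bindnetaddr: 10.20.30.0
        mcastport: 5510
    }
}

nodelist {
    node {
        ring0_addr: $addr
        nodeid: 1
    }
}

quorum {
    provider: corosync_votequorum
    expected_votes: 1
}
EOF
}

gen_udpu_conf
```

The output could be redirected to a scratch file and compared against the running configuration before anything is changed on the node.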



> BR, Stefan
>
> My corosync.conf:
>
> # Please read the corosync.conf.5 manual page
> compatibility: whitetank
>
> aisexec {
>  user:root
>  group:root
> }
>
> totem {
>  version: 2
>
>  # Security configuration
>  secauth: on
>  threads: 0
>
>  # Timeout for token
>  token: 1000
>  token_retransmits_before_loss_const: 4
>
>  # Number of messages that may be sent by one processor on receipt of the token
>  max_messages: 20
>
>  # How long to wait for join messages in the membership protocol (ms)
>  join: 50
>  consensus: 1200
>
>  # Turn off the virtual synchrony filter
>  vsftype: none
>
>  # Stagger sending the node join messages by 1..send_join ms
>  send_join: 50
>
>  # Limit generated nodeids to 31-bits (positive signed integers)
>  clear_node_high_bit: yes
>
>  # Interface configuration
>  rrp_mode: passive
>  interface {
>  ringnumber: 0
>  bindnetaddr: 10.20.30.0
>  mcastaddr: 226.95.30.100
>  mcastport: 5510
>  }
>  interface {
>  ringnumber: 1
>  bindnetaddr: 10.20.31.0
>  mcastaddr: 226.95.31.100
>  mcastport: 5510
>  }
> }
>
> logging {
>  fileline: off
>  to_stderr: no
>  to_logfile: no
>  to_syslog: yes
>  syslog_facility: local3
>  debug: off
> }
>
> amf {
>  mode: disabled
> }
>
> quorum {
>  provider: corosync_votequorum
>  expected_votes: 1
> }



Re: [ClusterLabs] Coming in Pacemaker 1.1.17: start a node in standby

2017-04-27 Thread Jehan-Guillaume de Rorthais
On Thu, 27 Apr 2017 16:07:11 +0200
Lars Ellenberg  wrote:

> On Thu, Apr 27, 2017 at 09:19:55AM +0200, Jehan-Guillaume de Rorthais wrote:
> > > > > I seem to remember that at some deployment,
> > > > > we set the node instance attribute standby=on, always,
> > > > > and took it out of standby using the node_state
> > > > > transient_attribute :-)
> > > > > 
> > > > > As in
> > > > > # crm node standby ava  
> 
> > > > > # crm node status-attr ava set standby off  
> 
> > > Well, you want the "persistent" setting "on",
> > > and override it with a "transient" setting "off".  
> 
> > Quick questions:
> > 
> >   * is this what happens in the CIB when you call crm_standby?
> 
> crm_standby --node emma --lifetime reboot --update off
> crm_standby --node emma --lifetime forever --update on
> 
> (or -n -l -v,
> default node is current node,
> default lifetime is forever)
> 
> >   * is it possible to do the opposite? persistent setting "off" and
> > override it with the transient setting?  
> 
> see above, also man crm_standby,
> which again is only a wrapper around crm_attribute.

Thank you!



Re: [ClusterLabs] Coming in Pacemaker 1.1.17: start a node in standby

2017-04-27 Thread Lars Ellenberg
On Thu, Apr 27, 2017 at 09:19:55AM +0200, Jehan-Guillaume de Rorthais wrote:
> > > > I seem to remember that at some deployment,
> > > > we set the node instance attribute standby=on, always,
> > > > and took it out of standby using the node_state transient_attribute :-)
> > > > 
> > > > As in
> > > > # crm node standby ava

> > > > # crm node status-attr ava set standby off

> > Well, you want the "persistent" setting "on",
> > and override it with a "transient" setting "off".

> Quick questions:
> 
>   * is this what happens in the CIB when you call crm_standby?

crm_standby --node emma --lifetime reboot --update off
crm_standby --node emma --lifetime forever --update on

(or -n -l -v,
default node is current node,
default lifetime is forever)

>   * is it possible to do the opposite? persistent setting "off" and
> override it with the transient setting?

see above, also man crm_standby,
which again is only a wrapper around crm_attribute.
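
The "persistent on, transient off" pattern from this exchange can be sketched as a small dry-run script. It prints the crm_standby commands instead of running them. The function names are made up for this example, "emma" is the node name used above, and the --query calls are an addition for checking which of the two values is set.

```shell
#!/bin/sh
# Dry-run sketch of the "persistent on, transient off" pattern.
# Commands are printed, not executed; "emma" is the node name used above.
run() { echo "$*"; }    # on a real cluster: run() { "$@"; }

NODE=emma

# Persistent attribute: the node defaults to standby.
default_to_standby() {
    run crm_standby --node "$NODE" --lifetime forever --update on
}

# Transient override: leave standby until the cluster is next restarted
# on that node, at which point the persistent value applies again.
activate_until_reboot() {
    run crm_standby --node "$NODE" --lifetime reboot --update off
}

# Query both lifetimes to see which value is set where.
show_standby() {
    run crm_standby --node "$NODE" --lifetime forever --query
    run crm_standby --node "$NODE" --lifetime reboot --query
}

default_to_standby
activate_until_reboot
show_standby
```

Swapping "on" and "off" between the two update calls gives the opposite arrangement asked about in the second question.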

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support

DRBD® and LINBIT® are registered trademarks of LINBIT



[ClusterLabs] Corosync CPU load slowly increasing if one node present

2017-04-27 Thread Stefan Kohlhauser
Hello everyone!

I am using Pacemaker (1.1.12), Corosync (2.3.0) and libqb (0.16.0) in 2-node 
clusters (virtualized in VMware infrastructure, OS: RHEL 6.7).
I noticed that if only one node is present, the CPU usage of Corosync (as seen 
with top) is slowly but steadily increasing (over days; in my setting about 1% 
per day). The node is basically idle, some Pacemaker managed resources are 
running but they are not contacted by any clients.
I upgraded a test stand-alone node to Corosync (2.4.2) and libqb (1.0.1) (which 
at least made the memleak go away), but the CPU usage is still increasing on 
the node.
When I add a second node to the cluster, the CPU load drops back down to a 
normal (low) CPU usage.
I haven't witnessed the increasing CPU load yet if two nodes were present in a 
cluster.

Even if running Pacemaker/Corosync as a massive-overkill-Monit-replacement is 
questionable, the observed CPU-load is not what I expect to happen.

What could be the reason for this CPU-load increase? Is there a rationale behind 
this?
Is this a config thing or something in the binaries?

BR, Stefan

My corosync.conf:

# Please read the corosync.conf.5 manual page
compatibility: whitetank

aisexec {
user:root
group:root
}

totem {
version: 2

# Security configuration
secauth: on
threads: 0

# Timeout for token
token: 1000
token_retransmits_before_loss_const: 4

# Number of messages that may be sent by one processor on receipt of the token
max_messages: 20

# How long to wait for join messages in the membership protocol (ms)
join: 50
consensus: 1200

# Turn off the virtual synchrony filter
vsftype: none

# Stagger sending the node join messages by 1..send_join ms
send_join: 50

# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes

# Interface configuration
rrp_mode: passive
interface {
ringnumber: 0
bindnetaddr: 10.20.30.0
mcastaddr: 226.95.30.100
mcastport: 5510
}
interface {
ringnumber: 1
bindnetaddr: 10.20.31.0
mcastaddr: 226.95.31.100
mcastport: 5510
}
}

logging {
fileline: off
to_stderr: no
to_logfile: no
to_syslog: yes
syslog_facility: local3
debug: off
}

amf {
mode: disabled
}

quorum {
provider: corosync_votequorum
expected_votes: 1
}



[ClusterLabs] Antw: in standby but still running resources..

2017-04-27 Thread Ulrich Windl
Did you try a reprobe, watching messages?

Regards,
Ulrich

>>> lejeczek wrote on 27.04.2017 at 15:29 in message
<0e1570ac-995a-371c-a05d-2c16d42cf...@yahoo.co.uk>:
> .. is this ok?
> 
> hi guys,
> 
> pcs shows no errors after I put a node in standby, but pcs shows
> resources still running on the node I just put in standby.
> Is this normal?
> 
> 0.9.152 @C7.3
> thanks
> P.
> 


[ClusterLabs] in standby but still running resources..

2017-04-27 Thread lejeczek

.. is this ok?

hi guys,

pcs shows no errors after I put a node in standby, but pcs shows
resources still running on the node I just put in standby.

Is this normal?

0.9.152 @C7.3
thanks
P.



Re: [ClusterLabs] Coming in Pacemaker 1.1.17: start a node in standby

2017-04-27 Thread Jehan-Guillaume de Rorthais
On Tue, 25 Apr 2017 10:33:13 +0200
Lars Ellenberg  wrote:

> On Tue, Apr 25, 2017 at 10:27:43AM +0200, Jehan-Guillaume de Rorthais wrote:
> > On Tue, 25 Apr 2017 10:02:21 +0200
> > Lars Ellenberg  wrote:
> >   
> > > On Mon, Apr 24, 2017 at 03:08:55PM -0500, Ken Gaillot wrote:  
> > > > Hi all,
> > > > 
> > > > Pacemaker 1.1.17 will have a feature that people have occasionally asked
> > > > for in the past: the ability to start a node in standby mode.
> > > 
> > > 
> > > I seem to remember that at some deployment,
> > > we set the node instance attribute standby=on, always,
> > > and took it out of standby using the node_state transient_attribute :-)
> > > 
> > > As in
> > > # crm node standby ava
> > >   
> > > 
> > >   
> > >   ...
> > > 
> > >   
> > >   ...  
> > 
> > This solution seems much more elegant and obvious to me. A cli
> > (crm_standby?) interface would be ideal.
> > 
> > It feels weird to mix setup interfaces (through crm_standby or through the
> > config file) to manipulate the same node attribute. Isn't it possible to set
> > the standby instance attribute of a node **before** it is added to the
> > cluster? 
> > > # crm node status-attr ava set standby off
> > >> > crm-debug-origin="do_update_resource" join="member" expected="member"> ...
> > > 
> > >   
> > >   ...
> > > 
> > >   
> > > 
> > > 
> > 
> > It is not really straight forward to understand why you need to edit a
> > second different nvpair to exit the standby mode... :/  
> 
> Well, you want the "persistent" setting "on",
> and override it with a "transient" setting "off".

Quick questions:

  * is this what happens in the CIB when you call crm_standby?
  * is it possible to do the opposite? persistent setting "off" and override it
with the transient setting?

> That's how to do it in pacemaker.

OK

> But yes, what exactly has ever been "obvious" in pacemaker,
> before you knew?  :-)(or HA in general, to be fair)

Sure, the more I dig, the more I learn about it...

