Re: [ClusterLabs] Opt-in cluster shows resources stopped where no nodes should be considered

2016-03-04 Thread Ken Gaillot
On 03/04/2016 04:51 AM, Martin Schlegel wrote:
> Hello all
> 
> While our cluster seems to be working just fine, I have noticed something in
> the crm_mon output that I don't quite understand, and it is throwing off my
> monitoring a bit, as stopped resources could mean something is wrong. I was
> hoping somebody could help me understand what it means. It seems this might
> have something to do with the fact that I am using remote nodes, but I cannot
> wrap my head around it.
> 
> What I am seeing are 3 additional, unexpected lines in the crm_mon -1rR output
> listing my "p_pgcPgbouncer_test" resources as stopped, even though in my mind
> there should not be any more nodes to consider (opt-in cluster, see the
> location rules). At the same time this is not happening to my p_pgsqln
> resources, as shown at the top of the crm_mon output.

There are two things to look at here: the crm_mon options, and the
clone-max property.

-r means "show inactive resources", and -R means "show more detail". For
clones, this will show all clone instances individually, even if they
can't currently run anywhere due to a constraint. Don't use those
options if you don't want to see that level of detail.
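For comparison, the flag combinations can be sketched like this (to be run against a live cluster; behavior as described above for this crm_mon version):

```
crm_mon -1      # one-shot status; unplaceable clone instances summarized
crm_mon -1r     # also list inactive resources
crm_mon -1rR    # additionally expand every clone instance individually
```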

clone-max defaults to the number of nodes. I'm guessing you let it
default, so Pacemaker will actually allocate 5 clone instances, even
though only 2 of them can run under the current constraints. Setting
clone-max=2 on the clone resource would make the extra instances go away.
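
A sketch of that change in crmsh syntax (assuming the clone is named cl_pgcPgbouncer as in your configuration; double-check the subcommand against your crmsh version):

```
# Cap the clone at 2 instances so the 3 unplaceable ones no longer
# show up as Stopped in crm_mon -1rR output:
crm resource meta cl_pgcPgbouncer set clone-max 2
```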

> The important crm_mon -1rR output lines further below are marked with
> arrows ("-> ... <---").
> 
> 
> Some background on the policy:
> We are running an asymmetric / opt-in cluster (property
> symmetric-cluster=false).
> 
> 
> The cluster's main purpose is to take care of a 3+-node replicating master /
> slave database running strictly on nodes pg1, pg2 and pg3, per location rule
> l_pgs_resources.
> 
> We also have 2 remote nodes pgalog1 & pgalog2 defined to control database
> connection pooler resources (p_pgcPgbouncer_test) to facilitate client
> connection rerouting, as per location rule l_pgc_resources.
> 
> 
> crm_mon -1rR output:
> 
> Last updated: Fri Mar  4 09:56:02 2016  Last change: Fri Mar  4 09:55:47
> 2016 by root via cibadmin on pg1
> Stack: corosync
> Current DC: pg1 (1) (version 1.1.14-70404b0) - partition with quorum
> 5 nodes and 29 resources configured
> 
> Online: [ pg1 (1) pg2 (2) pg3 (3) ]
> RemoteOnline: [ pgalog1 pgalog2 ]
> 
> Full list of resources:
> 
>  Master/Slave Set: ms_pgsqln [p_pgsqln]
>  p_pgsqln   (ocf::heartbeat:pgsqln):Master pg3
>  p_pgsqln   (ocf::heartbeat:pgsqln):Started pg1
>  p_pgsqln   (ocf::heartbeat:pgsqln):Started pg2
> -> NO additional lines here <---
>  Masters: [ pg3 ]
>  Stopped: [ pg1 pg2 ]
> [...]
>  pgalog1(ocf::pacemaker:remote):Started pg1
>  pgalog2(ocf::pacemaker:remote):Started pg3
>  Clone Set: cl_pgcPgbouncer [p_pgcPgbouncer_test]
>  p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Started pgalog1
>  p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Started pgalog2
> ->   p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <---
> ->   p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <---
> ->   p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <---
>  Started: [ pgalog1 pgalog2 ]
> 
> 
> 
> Here are the most important parts of the configuration as shown in "crm
> configure show":
> 
> [...]
> primitive pgalog1 ocf:pacemaker:remote \
>   params server=pgalog1 port=3121 \
>   meta target-role=Started
> primitive pgalog2 ocf:pacemaker:remote \
>   params server=pgalog2 port=3121 \
>   meta target-role=Started
> [...]
> location l_pgc_resources { cl_pgcPgbouncer } resource-discovery=exclusive \
>   rule #uname eq pgalog1 \
>   rule #uname eq pgalog2
> 
> location l_pgs_resources { cl_pgsServices1 ms_pgsqln p_pgsBackupjob pgalog1
> pgalog2 } resource-discovery=exclusive \
>   rule #uname eq pg1 \
>   rule #uname eq pg2 \
>   rule #uname eq pg3
> 
> [...]
> property cib-bootstrap-options: \
>   symmetric-cluster=false \
> [...]
> 
> 
> Regards,
> Martin Schlegel


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Clus



Re: [ClusterLabs] Removing node from pacemaker.

2016-03-04 Thread Andrei Maruha
I have tried it on my cluster; "crm node delete" just removes the node from
the CIB without updating corosync.conf.


After restart of pacemaker service you will get something like this:
Online: [ node1 ]
OFFLINE: [ node2 ]


BTW, you will get the same state after a pacemaker restart if you
remove a node from corosync.conf and do not call "crm corosync reload".


On 03/04/2016 12:07 PM, Dejan Muhamedagic wrote:

Hi,

On Thu, Mar 03, 2016 at 03:20:56PM +0300, Andrei Maruha wrote:

Hi,
Usually I use the following steps to delete node from the cluster:
1. #crm corosync del-node 
2. #crm_node -R node --force
3. #crm corosync reload

I'd expect all this to be wrapped in "crm node delete". Isn't
that the case?

Also, is "corosync reload" really required after node removal?

Thanks,

Dejan


Instead of steps 1 and 2 you can delete the node from the
corosync config manually and run:
#corosync-cfgtool -R
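
Pulling the two variants together, a sketch of the full removal sequence (crmsh syntax; "node2" is an illustrative node name):

```
# Variant A: let crmsh drive all three steps
crm corosync del-node node2    # 1. remove the node from corosync.conf
crm_node -R node2 --force      # 2. purge the node from the CIB
crm corosync reload            # 3. have corosync re-read its config

# Variant B: edit corosync.conf by hand instead of steps 1 and 3, then:
corosync-cfgtool -R            # signal the corosync daemons to reload
```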

On 03/03/2016 02:44 PM, Somanath Jeeva wrote:

Hi,

I am trying to remove a node from the pacemaker/corosync cluster,
using the command "crm_node -R dl360x4061 --force".

Though this command removes the node from the cluster, it is
appearing as offline after a pacemaker/corosync restart on the nodes
that are online.

Is there any other command to completely delete the node from the
pacemaker/corosync cluster?

Pacemaker and Corosync Versions.

PACEMAKER=1.1.10

COROSYNC=1.4.1

Regards

Somanath Thilak J






[ClusterLabs] Avoid HTML-only please (Was: crm_mon change in behaviour PM 1.1.12 -> 1.1.14: crm_mon -XA filters #health.* node attributes)

2016-03-04 Thread Jan Pokorný
On 03/03/16 17:07 +0100, Martin Schlegel wrote:
> Hello everybody

Welcome Martin,

> This is my first post on this mailing list and I am only using Pacemaker since
> fall 2015 ... please be gentle :-) and I will do the same.

the list would really appreciate it if you could make your email client
(be it SW run on your machine or a web-based one) send plain-text
format when addressing it (mixed plain-text + HTML is fine).

For instance, see how your post looks in the archives:
http://oss.clusterlabs.org/pipermail/users/2016-March/002398.html

Thanks for understanding.

-- 
Jan (Poki)

