Re: [ClusterLabs] Opt-in cluster shows resources stopped where no nodes should be considered
On 03/04/2016 04:51 AM, Martin Schlegel wrote:
> Hello all
>
> While our cluster seems to be working just fine I have noticed something in
> the crm_mon output that I don't quite understand and that is throwing off my
> monitoring a bit as stopped resources could mean something is wrong. I was
> hoping somebody could help me to understand what it means. It seems this might
> have something to do with the fact I am using remote nodes, but I cannot wrap
> my head around it.
>
> What I am seeing are 3 additional, unexpected lines in the crm_mon -1rR output
> listing my "p_pgcPgbouncer_test" resources as stopped even though there should
> not be any more nodes to be considered in my mind (opt-in cluster, see
> location rules). At the same time this is not happening to my p_pgsqln
> resources as shown at the top of the crm_mon output.

There are two things to look at here: the crm_mon options, and the
clone-max property.

-r means "show inactive resources", and -R means "show more detail". For
clones, this will show all clone instances individually, even if they
can't currently run anywhere due to a constraint. Don't use those
options if you don't want to see that level of detail.

clone-max defaults to the number of nodes. I'm guessing you let it
default, so pacemaker will actually prepare 5 clone instances, even
though only 2 of them can run under current conditions. Setting
clone-max=2 on the clone resource would make the other instances go away.

> The important crm_mon -1rR output lines further below are marked with arrows
> -> <---.
>
> Some background on the policy:
> We are running an asymmetric / opt-in cluster (property
> symmetric-cluster=false).
>
> The cluster's main purpose is to take care of a 3+-nodes replicating master /
> slave database running strictly on nodes pg1, pg2 and pg3 per location rule
> l_pgs_resources.
>
> We also have 2 remote nodes pgalog1 & pgalog2 defined to control database
> connection pooler resources (p_pgcPgbouncer_test) to facilitate client
> connection reroute as per location rule l_pgc_resources.
>
> crm_mon -1rR output:
>
> Last updated: Fri Mar 4 09:56:02 2016
> Last change: Fri Mar 4 09:55:47 2016 by root via cibadmin on pg1
> Stack: corosync
> Current DC: pg1 (1) (version 1.1.14-70404b0) - partition with quorum
> 5 nodes and 29 resources configured
>
> Online: [ pg1 (1) pg2 (2) pg3 (3) ]
> RemoteOnline: [ pgalog1 pgalog2 ]
>
> Full list of resources:
>
>  Master/Slave Set: ms_pgsqln [p_pgsqln]
>      p_pgsqln (ocf::heartbeat:pgsqln):Master pg3
>      p_pgsqln (ocf::heartbeat:pgsqln):Started pg1
>      p_pgsqln (ocf::heartbeat:pgsqln):Started pg2
>      -> NO additional lines here <---
>      Masters: [ pg3 ]
>      Stopped: [ pg1 pg2 ]
> [...]
>  pgalog1(ocf::pacemaker:remote):Started pg1
>  pgalog2(ocf::pacemaker:remote):Started pg3
>  Clone Set: cl_pgcPgbouncer [p_pgcPgbouncer_test]
>      p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Started pgalog1
>      p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Started pgalog2
>   -> p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <
>   -> p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <
>   -> p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <
>      Started: [ pgalog1 pgalog2 ]
>
> Here are the most important parts of the configuration as shown in "crm
> configure show":
>
> [...]
> primitive pgalog1 ocf:pacemaker:remote \
>     params server=pgalog1 port=3121 \
>     meta target-role=Started
> primitive pgalog2 ocf:pacemaker:remote \
>     params server=pgalog2 port=3121 \
>     meta target-role=Started
> [...]
> location l_pgc_resources { cl_pgcPgbouncer } resource-discovery=exclusive \
>     rule #uname eq pgalog1 \
>     rule #uname eq pgalog2
>
> location l_pgs_resources { cl_pgsServices1 ms_pgsqln p_pgsBackupjob pgalog1 pgalog2 } resource-discovery=exclusive \
>     rule #uname eq pg1 \
>     rule #uname eq pg2 \
>     rule #uname eq pg3
>
> [...]
> property cib-bootstrap-options: \
>     symmetric-cluster=false \
>     [...]
>
> Regards,
> Martin Schlegel
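The clone-max arithmetic described in the reply above can be sketched in a few lines of Python. This is only an illustrative model, not Pacemaker code: the function name `clone_instances` and its parameters are made up for this example, and it assumes one instance per eligible node (the clone-node-max default of 1).

```python
from typing import Optional


def clone_instances(total_nodes: int, eligible_nodes: int,
                    clone_max: Optional[int] = None) -> dict:
    """Model how many clone instances exist versus how many can start.

    total_nodes    -- every node in the cluster, remote nodes included
    eligible_nodes -- nodes the location rules allow the clone to run on
    clone_max      -- explicit clone-max, or None to use the default
    """
    # Per the reply above, clone-max defaults to the number of nodes.
    prepared = total_nodes if clone_max is None else clone_max
    # Simplifying assumption: at most one instance per eligible node.
    started = min(prepared, eligible_nodes)
    return {"prepared": prepared, "started": started,
            "stopped": prepared - started}


# Defaulted clone-max on the 5-node cluster with 2 eligible remote nodes.
print(clone_instances(total_nodes=5, eligible_nodes=2))
# Same cluster with clone-max=2 set on the clone resource.
print(clone_instances(total_nodes=5, eligible_nodes=2, clone_max=2))
```

With the numbers from this thread, the first call reports 3 stopped instances, matching the three extra "Stopped" lines in the crm_mon -1rR output, and the second call reports none.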
[ClusterLabs] Opt-in cluster shows resources stopped where no nodes should be considered
Hello all

While our cluster seems to be working just fine I have noticed something in
the crm_mon output that I don't quite understand and that is throwing off my
monitoring a bit as stopped resources could mean something is wrong. I was
hoping somebody could help me to understand what it means. It seems this might
have something to do with the fact I am using remote nodes, but I cannot wrap
my head around it.

What I am seeing are 3 additional, unexpected lines in the crm_mon -1rR output
listing my "p_pgcPgbouncer_test" resources as stopped even though there should
not be any more nodes to be considered in my mind (opt-in cluster, see
location rules). At the same time this is not happening to my p_pgsqln
resources as shown at the top of the crm_mon output.

The important crm_mon -1rR output lines further below are marked with arrows
-> <---.

Some background on the policy:
We are running an asymmetric / opt-in cluster (property
symmetric-cluster=false).

The cluster's main purpose is to take care of a 3+-nodes replicating master /
slave database running strictly on nodes pg1, pg2 and pg3 per location rule
l_pgs_resources.

We also have 2 remote nodes pgalog1 & pgalog2 defined to control database
connection pooler resources (p_pgcPgbouncer_test) to facilitate client
connection reroute as per location rule l_pgc_resources.

crm_mon -1rR output:

Last updated: Fri Mar 4 09:56:02 2016
Last change: Fri Mar 4 09:55:47 2016 by root via cibadmin on pg1
Stack: corosync
Current DC: pg1 (1) (version 1.1.14-70404b0) - partition with quorum
5 nodes and 29 resources configured

Online: [ pg1 (1) pg2 (2) pg3 (3) ]
RemoteOnline: [ pgalog1 pgalog2 ]

Full list of resources:

 Master/Slave Set: ms_pgsqln [p_pgsqln]
     p_pgsqln (ocf::heartbeat:pgsqln):Master pg3
     p_pgsqln (ocf::heartbeat:pgsqln):Started pg1
     p_pgsqln (ocf::heartbeat:pgsqln):Started pg2
     -> NO additional lines here <---
     Masters: [ pg3 ]
     Stopped: [ pg1 pg2 ]
[...]
 pgalog1(ocf::pacemaker:remote):Started pg1
 pgalog2(ocf::pacemaker:remote):Started pg3
 Clone Set: cl_pgcPgbouncer [p_pgcPgbouncer_test]
     p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Started pgalog1
     p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Started pgalog2
  -> p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <
  -> p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <
  -> p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <
     Started: [ pgalog1 pgalog2 ]

Here are the most important parts of the configuration as shown in "crm
configure show":

[...]
primitive pgalog1 ocf:pacemaker:remote \
    params server=pgalog1 port=3121 \
    meta target-role=Started
primitive pgalog2 ocf:pacemaker:remote \
    params server=pgalog2 port=3121 \
    meta target-role=Started
[...]
location l_pgc_resources { cl_pgcPgbouncer } resource-discovery=exclusive \
    rule #uname eq pgalog1 \
    rule #uname eq pgalog2

location l_pgs_resources { cl_pgsServices1 ms_pgsqln p_pgsBackupjob pgalog1 pgalog2 } resource-discovery=exclusive \
    rule #uname eq pg1 \
    rule #uname eq pg2 \
    rule #uname eq pg3

[...]
property cib-bootstrap-options: \
    symmetric-cluster=false \
    [...]

Regards,
Martin Schlegel

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] Removing node from pacemaker.
I have tried it on my cluster: "crm node delete" just removes the node from
the CIB without updating corosync.conf.

After a restart of the pacemaker service you will get something like this:

Online: [ node1 ]
OFFLINE: [ node2 ]

BTW, you will get the same state after "pacemaker restart" if you remove a
node from corosync.conf and do not call "crm corosync reload".

On 03/04/2016 12:07 PM, Dejan Muhamedagic wrote:
> Hi,
>
> On Thu, Mar 03, 2016 at 03:20:56PM +0300, Andrei Maruha wrote:
>> Hi,
>> Usually I use the following steps to delete node from the cluster:
>> 1. #crm corosync del-node
>> 2. #crm_node -R node --force
>> 3. #crm corosync reload
>
> I'd expect all this to be wrapped in "crm node delete". Isn't
> that the case?
>
> Also, is "corosync reload" really required after node removal?
>
> Thanks,
>
> Dejan
>
>> Instead of steps 1 and 2 you can delete the node from the
>> corosync config manually and run:
>> #corosync-cfgtool -R
>>
>> On 03/03/2016 02:44 PM, Somanath Jeeva wrote:
>>>
>>> Hi,
>>>
>>> I am trying to remove a node from the pacemaker/corosync cluster,
>>> using the command "crm_node -R dl360x4061 --force".
>>>
>>> Though this command removes the node from the cluster, it is
>>> appearing as offline after pacemaker/corosync restart in the nodes
>>> that are online.
>>>
>>> Is there any other command to completely delete the node from the
>>> pacemaker/corosync cluster?
>>>
>>> Pacemaker and Corosync Versions.
>>>
>>> PACEMAKER=1.1.10
>>>
>>> COROSYNC=1.4.1
>>>
>>> Regards
>>>
>>> Somanath Thilak J
Re: [ClusterLabs] Removing node from pacemaker.
Hi,

On Thu, Mar 03, 2016 at 03:20:56PM +0300, Andrei Maruha wrote:
> Hi,
> Usually I use the following steps to delete node from the cluster:
> 1. #crm corosync del-node
> 2. #crm_node -R node --force
> 3. #crm corosync reload

I'd expect all this to be wrapped in "crm node delete". Isn't
that the case?

Also, is "corosync reload" really required after node removal?

Thanks,

Dejan

> Instead of steps 1 and 2 you can delete the node from the
> corosync config manually and run:
> #corosync-cfgtool -R
>
> On 03/03/2016 02:44 PM, Somanath Jeeva wrote:
> >
> >Hi,
> >
> >I am trying to remove a node from the pacemaker/corosync cluster,
> >using the command "crm_node -R dl360x4061 --force".
> >
> >Though this command removes the node from the cluster, it is
> >appearing as offline after pacemaker/corosync restart in the nodes
> >that are online.
> >
> >Is there any other command to completely delete the node from the
> >pacemaker/corosync cluster?
> >
> >Pacemaker and Corosync Versions.
> >
> >PACEMAKER=1.1.10
> >
> >COROSYNC=1.4.1
> >
> >Regards
> >
> >Somanath Thilak J
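The three-step removal sequence quoted in this thread can be written down as a small dry-run helper. This is a sketch only: the function name `removal_commands` is hypothetical, and passing the node name to "crm corosync del-node" is an assumption, since the thread lists that step without an argument. Whether these crmsh subcommands are available also depends on the installed crmsh and corosync versions.

```python
import shlex
from typing import List


def removal_commands(node: str) -> List[List[str]]:
    """Return the three removal steps from the thread as argv lists."""
    return [
        # Step 1: drop the node from corosync.conf
        # (the node argument here is an assumption).
        ["crm", "corosync", "del-node", node],
        # Step 2: remove the node from Pacemaker, as quoted in the thread.
        ["crm_node", "-R", node, "--force"],
        # Step 3: have corosync re-read its configuration.
        ["crm", "corosync", "reload"],
    ]


for argv in removal_commands("dl360x4061"):
    # Dry run: only print each command. To execute for real, one could
    # pass argv to subprocess.run(argv, check=True) on a cluster node.
    print(shlex.join(argv))
```

Keeping the steps as argv lists rather than shell strings avoids quoting problems if a node name ever contains unusual characters, and printing first makes it easy to review the sequence before running anything against a live cluster.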
[ClusterLabs] Avoid HTML-only please (Was: crm_mon change in behaviour PM 1.1.12 -> 1.1.14: crm_mon -XA filters #health.* node attributes)
On 03/03/16 17:07 +0100, Martin Schlegel wrote:
> Hello everybody

Welcome Martin,

> This is my first post on this mailing list and I am only using Pacemaker since
> fall 2015 ... please be gentle :-) and I will do the same.

The list would really appreciate it if you could make your email client
(be it software run on your machine or a web-based one) send plain-text
format when addressing it (mixed plain-text + HTML is fine).

For instance, see what your post looks like in the archives:
http://oss.clusterlabs.org/pipermail/users/2016-March/002398.html

Thanks for understanding.

-- 
Jan (Poki)