Thanks!

Will do that and let you know.

miha

On 8/20/2014 5:03 PM, emmanuel segura wrote:
Hi,

You need to give every cluster parameter to fence_bladecenter_snmp, so
from sip2 you need to use "Attributes: action=off ipaddr=172.30.0.2
port=8 community=test login=snmp8 passwd=soft1234". The command to test
your fencing from sip2 is "fence_bladecenter_snmp -a 172.30.0.2 -l snmp8
-p soft1234 -c test -o status", and if the status is OK, then once you
have scheduled downtime for your system you can try a reboot with
"fence_bladecenter_snmp -a 172.30.0.2 -l snmp8 -p soft1234 -c test -o
reboot".
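
For example (a sketch based on the attributes above; the -n/--plug
option, which selects the blade number, is not in the original commands
and the "blade 8 = sip1" mapping is an assumption taken from port=8 in
the fence_sip1 attributes):

  # check the power status of blade 8 from sip2 (-n 8 = assumed plug/blade number for sip1)
  fence_bladecenter_snmp -a 172.30.0.2 -l snmp8 -p soft1234 -c test -n 8 -o status

  # only during a scheduled maintenance window: reboot that blade
  fence_bladecenter_snmp -a 172.30.0.2 -l snmp8 -p soft1234 -c test -n 8 -o reboot

If the status call prints the power state and exits 0, the agent can
reach the BladeCenter management module with those credentials.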

2014-08-20 16:22 GMT+02:00 Miha <m...@softnet.si>:
OK, will do that. This will not affect sip2?

Sorry for my noob question, but I must be careful as this is in production ;)

So, "fence_bladecenter_snmp ... -o reboot", right?

br
miha

On 8/19/2014 11:53 AM, emmanuel segura wrote:

Sorry,

That was a typo. I meant: "try to power off sip1 by hand, using
fence_bladecenter_snmp in your shell".

2014-08-19 11:17 GMT+02:00 Miha <m...@softnet.si>:
hi,

What do you mean by "by had for poweroff sp1"? Do you mean power off server sip1?

One thing also bothers me: why is the cluster service not running on
sip2 when the virtual IP etc. are all still running properly?

Thanks,
miha


On 8/19/2014 9:08 AM, emmanuel segura wrote:

Your config looks ok. Have you tried to use fence_bladecenter_snmp by
had for poweroff sp1?

http://www.linuxcertif.com/man/8/fence_bladecenter_snmp/

2014-08-19 8:05 GMT+02:00 Miha <m...@softnet.si>:
Sorry, here it is:

<cluster config_version="9" name="sipproxy">
     <fence_daemon/>
     <clusternodes>
       <clusternode name="sip1" nodeid="1">
         <fence>
           <method name="pcmk-method">
             <device name="pcmk-redirect" port="sip1"/>
           </method>
         </fence>
       </clusternode>
       <clusternode name="sip2" nodeid="2">
         <fence>
           <method name="pcmk-method">
             <device name="pcmk-redirect" port="sip2"/>
           </method>
         </fence>
       </clusternode>
     </clusternodes>
     <cman expected_votes="1" two_node="1"/>
     <fencedevices>
       <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
     </fencedevices>
     <rm>
       <failoverdomains/>
       <resources/>
     </rm>
</cluster>
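
This cluster.conf only redirects cman's fencing requests to Pacemaker
through the fence_pcmk agent; the real fence devices live in the
Pacemaker configuration. A minimal sketch of exercising that redirect
path (fence_node is cman's fencing utility and will really fence the
target, so only use it in a maintenance window):

  # on sip2: ask cman to fence sip1; fence_pcmk hands the request to Pacemaker's stonithd
  fence_node sip1

  # then check what Pacemaker did with the request
  grep -i stonith /var/log/messages | tail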


br
miha

On 8/18/2014 11:33 AM, emmanuel segura wrote:
Can you show your cman /etc/cluster/cluster.conf?

2014-08-18 7:08 GMT+02:00 Miha <m...@softnet.si>:
Hi Emmanuel,

this is my config:


Pacemaker Nodes:
     sip1 sip2

Resources:
     Master: ms_drbd_mysql
      Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
      Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
       Attributes: drbd_resource=clusterdb_res
       Operations: monitor interval=29s role=Master (p_drbd_mysql-monitor-29s)
                   monitor interval=31s role=Slave (p_drbd_mysql-monitor-31s)
     Group: g_mysql
      Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
       Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd fstype=ext4
       Meta Attrs: target-role=Started
      Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
       Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
      Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
       Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root
                   config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid
                   socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe
                   additional_parameters="--bind-address=212.13.249.55 --user=root"
       Meta Attrs: target-role=Started
       Operations: start interval=0 timeout=120s (p_mysql-start-0)
                   stop interval=0 timeout=120s (p_mysql-stop-0)
                   monitor interval=20s timeout=30s (p_mysql-monitor-20s)
     Clone: cl_ping
      Meta Attrs: interleave=true
      Resource: p_ping (class=ocf provider=pacemaker type=ping)
       Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
       Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
                   start interval=0s timeout=60s (p_ping-start-0s)
                   stop interval=0s (p_ping-stop-0s)
     Resource: opensips (class=lsb type=opensips)
      Meta Attrs: target-role=Started
      Operations: start interval=0 timeout=120 (opensips-start-0)
                  stop interval=0 timeout=120 (opensips-stop-0)

Stonith Devices:
     Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
      Attributes: action=off ipaddr=172.30.0.2 port=8 community=test login=snmp8 passwd=soft1234
      Meta Attrs: target-role=Started
     Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
      Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1 login=snmp8 passwd=soft1234
      Meta Attrs: target-role=Started
Fencing Levels:

Location Constraints:
      Resource: ms_drbd_mysql
        Constraint: l_drbd_master_on_ping
          Rule: score=-INFINITY role=Master boolean-op=or (id:l_drbd_master_on_ping-rule)
            Expression: not_defined ping (id:l_drbd_master_on_ping-expression)
            Expression: ping lte 0 type=number (id:l_drbd_master_on_ping-expression-0)
Ordering Constraints:
      promote ms_drbd_mysql then start g_mysql (INFINITY) (id:o_drbd_before_mysql)
      g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
Colocation Constraints:
      g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master) (id:c_mysql_on_drbd)
      opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)

Cluster Properties:
     cluster-infrastructure: cman
     dc-version: 1.1.10-14.el6-368c726
     no-quorum-policy: ignore
     stonith-enabled: true
Node Attributes:
     sip1: standby=off
     sip2: standby=off
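
For reference, the two stonith devices shown above map to pcs commands
roughly like the following (a sketch, assuming the pcs 0.9.x syntax that
ships with this cman/1.1.10 stack and the attribute values listed
above):

  # (sketch) recreate the fence devices exactly as dumped above
  pcs stonith create fence_sip1 fence_bladecenter_snmp \
      ipaddr=172.30.0.2 port=8 community=test login=snmp8 passwd=soft1234 action=off
  pcs stonith create fence_sip2 fence_bladecenter_snmp \
      ipaddr=172.30.0.2 port=9 community=test1 login=snmp8 passwd=soft1234 action=off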


br
miha

On 8/14/2014 3:05 PM, emmanuel segura wrote:

ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2): Stopped
Jul 03 14:10:51 [2701] sip2       crmd:   notice: too_many_st_failures:         No devices found in cluster to fence sip1, giving up

Jul 03 14:10:54 [2697] sip2 stonith-ng:     info: stonith_command:      Processed st_query reply from sip2: OK (0)
Jul 03 14:10:54 [2697] sip2 stonith-ng:    error: remote_op_done:       Operation reboot of sip1 by sip2 for stonith_admin.cman.28299@sip2.94474607: No such device

Jul 03 14:10:54 [2697] sip2 stonith-ng:     info: stonith_command:      Processed st_notify reply from sip2: OK (0)
Jul 03 14:10:54 [2701] sip2       crmd:   notice: tengine_stonith_notify:       Peer sip1 was not terminated (reboot) by sip2 for sip2: No such device (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client stonith_admin.cman.28299





:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Sorry for the short answer. Have you tested your cluster fencing? Can
you show your cluster.conf XML?

2014-08-14 14:44 GMT+02:00 Miha <m...@softnet.si>:
Emmanuel,

Thanks. But how can I find out why fencing stopped working?

br
miha

On 8/14/2014 2:35 PM, emmanuel segura wrote:

Node sip2 is UNCLEAN (offline) because the cluster fencing failed to
complete the operation.
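
One way to see why the fence attempt did not complete (a sketch; on a
RHEL 6 cman/pacemaker stack the stonith messages usually end up in
/var/log/messages, but the exact log location is an assumption):

  # log location assumed; adjust to wherever corosync/pacemaker log on this system
  grep -iE 'stonith|fence' /var/log/messages | tail -n 50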

2014-08-14 14:13 GMT+02:00 Miha <m...@softnet.si>:
Hi.

Another thing.

On node 1 (sip1), pcs is running:
[root@sip1 ~]# pcs status
Cluster name: sipproxy
Last updated: Thu Aug 14 14:13:37 2014
Last change: Sat Feb  1 20:10:48 2014 via crm_attribute on sip1
Stack: cman
Current DC: sip1 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured
10 Resources configured


Node sip2: UNCLEAN (offline)
Online: [ sip1 ]

Full list of resources:

       Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
           Masters: [ sip2 ]
           Slaves: [ sip1 ]
       Resource Group: g_mysql
           p_fs_mysql (ocf::heartbeat:Filesystem):    Started sip2
           p_ip_mysql (ocf::heartbeat:IPaddr2):       Started sip2
           p_mysql    (ocf::heartbeat:mysql): Started sip2
       Clone Set: cl_ping [p_ping]
           Started: [ sip1 sip2 ]
       opensips       (lsb:opensips): Stopped
       fence_sip1     (stonith:fence_bladecenter_snmp): Started sip2
       fence_sip2     (stonith:fence_bladecenter_snmp): Started sip2


[root@sip1 ~]#





On 8/14/2014 2:12 PM, Miha wrote:

Hi Emmanuel,

I think so; what is the best way to check?

Sorry for my noob question. I configured this 6 months ago and
everything was working fine till now. Now I need to find out what
really happened before I do something stupid.

Thanks

On 8/14/2014 1:58 PM, emmanuel segura wrote:
Are you sure your cluster fencing is working?

2014-08-14 13:40 GMT+02:00 Miha <m...@softnet.si>:
Hi,

I noticed today that I am having some problems with the cluster. The
master server is offline, but the virtual IP is still assigned to it
and all services are running properly (for production).

If I do this, I get these notifications:

[root@sip2 cluster]# pcs status
Error: cluster is not currently running on this node
[root@sip2 cluster]# /etc/init.d/corosync status
corosync dead but pid file exists
[root@sip2 cluster]# pcs status
Error: cluster is not currently running on this node
[root@sip2 cluster]#
[root@sip2 cluster]#
[root@sip2 cluster]# tailf fenced.log
Aug 14 13:34:25 fenced cman_get_cluster error -1 112


The main thing is: what do I do now? Do a "pcs start" and hope for the
best, or what?

I have pasted the log to pastebin: http://pastebin.com/SUp2GcmN
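
A minimal sketch of what one might check before restarting anything
(assuming the RHEL 6 cman + pacemaker stack and the pcs tool used
elsewhere in this thread; everything here only inspects state except
the last command, which rejoins the node to the cluster):

  # on sip2: are the stack daemons really down?
  service cman status
  service pacemaker status

  # view the cluster from the healthy node
  pcs status                 # run on sip1

  # once the cause is understood, bring sip2 back in
  pcs cluster start          # run on sip2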

Thanks!

miha

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
