ok, will do that. This will not affect sip2, right?

Sorry for my noob question, but I have to be careful as this is in production ;)

So, "fence_bladecenter_snmp reboot", right?

br
miha

On 8/19/2014 11:53 AM, emmanuel segura wrote:
Sorry,

That was a typo. I meant: "try to power off sip1 by hand, using
fence_bladecenter_snmp in your shell".
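
For reference, a manual invocation could look roughly like this (only a sketch, reusing the ipaddr/community/login values from the stonith resources shown further down in this thread; double-check the exact option names against the fence_bladecenter_snmp man page, and start with a read-only "status" action before anything destructive):

    # check the power state of blade 8 (sip1) without changing anything
    fence_bladecenter_snmp -a 172.30.0.2 -c test -l snmp8 -p soft1234 -n 8 -o status

    # if that works, actually power the blade off
    fence_bladecenter_snmp -a 172.30.0.2 -c test -l snmp8 -p soft1234 -n 8 -o off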

2014-08-19 11:17 GMT+02:00 Miha <m...@softnet.si>:
hi,

what do you mean by "by had of powweroff sp1"? Do you mean powering off
server sip1?

One thing also bothers me: why is the cluster service not running on sip2
when the virtual IP and the other resources are still running properly?

tnx
miha


On 8/19/2014 9:08 AM, emmanuel segura wrote:

Your config look ok, have you tried to use fence_bladecenter_snmp by
had for poweroff sp1?

http://www.linuxcertif.com/man/8/fence_bladecenter_snmp/
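
If you want to be extra careful on a production system, a read-only action is a good first step (again only a sketch using the option spellings of the generic fence-agents interface; verify them, and whether the "list" action is supported, against the man page above):

    # ask the management module which blades it can control, without touching power
    fence_bladecenter_snmp -a 172.30.0.2 -c test -l snmp8 -p soft1234 -o list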

2014-08-19 8:05 GMT+02:00 Miha <m...@softnet.si>:
Sorry, here it is:

<cluster config_version="9" name="sipproxy">
    <fence_daemon/>
    <clusternodes>
      <clusternode name="sip1" nodeid="1">
        <fence>
          <method name="pcmk-method">
            <device name="pcmk-redirect" port="sip1"/>
          </method>
        </fence>
      </clusternode>
      <clusternode name="sip2" nodeid="2">
        <fence>
          <method name="pcmk-method">
            <device name="pcmk-redirect" port="sip2"/>
          </method>
        </fence>
      </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1"/>
    <fencedevices>
      <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
    </fencedevices>
    <rm>
      <failoverdomains/>
      <resources/>
    </rm>
</cluster>
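
With this cluster.conf, cman never fences anything itself: fence_pcmk only redirects the fencing request to Pacemaker, which then runs its own stonith resources (the fence_bladecenter_snmp devices shown below). A rough way to exercise that whole path from the cman side is (a sketch; run it from the surviving node and be aware it will really try to fence the target):

    # ask cman/fenced to fence sip1; fence_pcmk hands the request to Pacemaker
    fence_node sip1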


br
miha

On 8/18/2014 11:33 AM, emmanuel segura wrote:
Can you show your cman /etc/cluster/cluster.conf?

2014-08-18 7:08 GMT+02:00 Miha <m...@softnet.si>:
Hi Emmanuel,

this is my config:


Pacemaker Nodes:
    sip1 sip2

Resources:
    Master: ms_drbd_mysql
     Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
     Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
      Attributes: drbd_resource=clusterdb_res
      Operations: monitor interval=29s role=Master (p_drbd_mysql-monitor-29s)
                  monitor interval=31s role=Slave (p_drbd_mysql-monitor-31s)
    Group: g_mysql
     Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
      Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd fstype=ext4
      Meta Attrs: target-role=Started
     Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
      Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
     Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
      Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe additional_parameters="--bind-address=212.13.249.55 --user=root"
      Meta Attrs: target-role=Started
      Operations: start interval=0 timeout=120s (p_mysql-start-0)
                  stop interval=0 timeout=120s (p_mysql-stop-0)
                  monitor interval=20s timeout=30s (p_mysql-monitor-20s)
    Clone: cl_ping
     Meta Attrs: interleave=true
     Resource: p_ping (class=ocf provider=pacemaker type=ping)
      Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
      Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
                  start interval=0s timeout=60s (p_ping-start-0s)
                  stop interval=0s (p_ping-stop-0s)
    Resource: opensips (class=lsb type=opensips)
     Meta Attrs: target-role=Started
     Operations: start interval=0 timeout=120 (opensips-start-0)
                 stop interval=0 timeout=120 (opensips-stop-0)

Stonith Devices:
    Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
      Attributes: action=off ipaddr=172.30.0.2 port=8 community=test login=snmp8 passwd=soft1234
     Meta Attrs: target-role=Started
    Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
      Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1 login=snmp8 passwd=soft1234
     Meta Attrs: target-role=Started
Fencing Levels:

Location Constraints:
     Resource: ms_drbd_mysql
       Constraint: l_drbd_master_on_ping
         Rule: score=-INFINITY role=Master boolean-op=or (id:l_drbd_master_on_ping-rule)
           Expression: not_defined ping (id:l_drbd_master_on_ping-expression)
           Expression: ping lte 0 type=number (id:l_drbd_master_on_ping-expression-0)
Ordering Constraints:
     promote ms_drbd_mysql then start g_mysql (INFINITY) (id:o_drbd_before_mysql)
     g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
Colocation Constraints:
     g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master) (id:c_mysql_on_drbd)
     opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)

Cluster Properties:
    cluster-infrastructure: cman
    dc-version: 1.1.10-14.el6-368c726
    no-quorum-policy: ignore
    stonith-enabled: true
Node Attributes:
    sip1: standby=off
    sip2: standby=off
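
A quick way to cross-check, from the Pacemaker side, that these two stonith resources are registered and actually cover each node (a sketch; the short options below are from pacemaker 1.1, see stonith_admin --help for your exact build):

    # list every stonith device currently registered with stonith-ng
    stonith_admin -L

    # list the devices that claim they can fence a given node
    stonith_admin -l sip1
    stonith_admin -l sip2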


br
miha

On 8/14/2014 3:05 PM, emmanuel segura wrote:

ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2): Stopped
Jul 03 14:10:51 [2701] sip2       crmd:   notice: too_many_st_failures:         No devices found in cluster to fence sip1, giving up

Jul 03 14:10:54 [2697] sip2 stonith-ng:     info: stonith_command:      Processed st_query reply from sip2: OK (0)
Jul 03 14:10:54 [2697] sip2 stonith-ng:    error: remote_op_done:      Operation reboot of sip1 by sip2 for stonith_admin.cman.28299@sip2.94474607: No such device

Jul 03 14:10:54 [2697] sip2 stonith-ng:     info: stonith_command:      Processed st_notify reply from sip2: OK (0)
Jul 03 14:10:54 [2701] sip2       crmd:   notice: tengine_stonith_notify:       Peer sip1 was not terminated (reboot) by sip2 for sip2: No such device (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client stonith_admin.cman.28299
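
In other words, at that moment stonith-ng on sip2 had no registered device that was allowed to fence sip1, so the reboot request failed with "No such device". Once the stack is healthy on both nodes again, this can be re-checked and retried by hand (a sketch; confirm that these options exist in your stonith_admin 1.1.10 build before relying on them):

    # fencing history for sip1, including the failed attempt above
    stonith_admin -H sip1

    # manually request a reboot of sip1 through Pacemaker's stonith layer
    stonith_admin -B sip1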




:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Sorry for the short answer. Have you tested your cluster fencing? Can
you show your cluster.conf XML?

2014-08-14 14:44 GMT+02:00 Miha <m...@softnet.si>:
emmanuel,

tnx. But how can I find out why fencing stopped working?

br
miha

On 8/14/2014 2:35 PM, emmanuel segura wrote:

Node sip2 is UNCLEAN (offline) because the cluster fencing failed to
complete the operation.
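
On the cman side that state can be seen directly (a sketch; both tools ship with cman on el6):

    # membership as cman sees it
    cman_tool nodes

    # fence domain state, including any node still waiting to be fenced
    fence_tool ls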

2014-08-14 14:13 GMT+02:00 Miha <m...@softnet.si>:
hi.

Another thing.

On node sip1, pcs is running:
[root@sip1 ~]# pcs status
Cluster name: sipproxy
Last updated: Thu Aug 14 14:13:37 2014
Last change: Sat Feb  1 20:10:48 2014 via crm_attribute on sip1
Stack: cman
Current DC: sip1 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured
10 Resources configured


Node sip2: UNCLEAN (offline)
Online: [ sip1 ]

Full list of resources:

      Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
          Masters: [ sip2 ]
          Slaves: [ sip1 ]
      Resource Group: g_mysql
          p_fs_mysql (ocf::heartbeat:Filesystem):    Started sip2
          p_ip_mysql (ocf::heartbeat:IPaddr2):       Started sip2
          p_mysql    (ocf::heartbeat:mysql): Started sip2
      Clone Set: cl_ping [p_ping]
          Started: [ sip1 sip2 ]
      opensips       (lsb:opensips): Stopped
      fence_sip1     (stonith:fence_bladecenter_snmp):       Started sip2
      fence_sip2     (stonith:fence_bladecenter_snmp):       Started sip2


[root@sip1 ~]#





On 8/14/2014 2:12 PM, Miha wrote:

Hi emmanuel,

I think so. What is the best way to check?

Sorry for my noob question; I configured this 6 months ago and
everything was working fine until now. Now I need to find out what really
happened before I do something stupid.



tnx

On 8/14/2014 1:58 PM, emmanuel segura wrote:
are you sure your cluster fencing is working?

2014-08-14 13:40 GMT+02:00 Miha <m...@softnet.si>:
Hi,

I noticed today that I am having some problems with the cluster. I noticed
the master server is offline, but the virtual IP is still assigned to it and
all services are running properly (for production).

If I do this, I am getting these notifications:

[root@sip2 cluster]# pcs status
Error: cluster is not currently running on this node
[root@sip2 cluster]# /etc/init.d/corosync status
corosync dead but pid file exists
[root@sip2 cluster]# pcs status
Error: cluster is not currently running on this node
[root@sip2 cluster]#
[root@sip2 cluster]#
[root@sip2 cluster]# tailf fenced.log
Aug 14 13:34:25 fenced cman_get_cluster error -1 112


The main thing is what to do now? Do "pcs start" and hope for the best,
or what?

I have pasted log in pastebin: http://pastebin.com/SUp2GcmN
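
For what it is worth, on a cman + pacemaker stack like this the usual sequence would be roughly the following (only a sketch assuming the standard el6 init scripts; check the state first, clean up the stale corosync pid file if the init script complains, and expect the cluster to re-evaluate where resources should run once sip2 rejoins):

    # on sip2: see what is actually running before starting anything
    service cman status
    service pacemaker status

    # corosync is started by the cman init script on this stack
    service cman start
    service pacemaker start

    # then watch the resources settle
    crm_mon -1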

tnx!

miha





_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
