Can you post your cman /etc/cluster/cluster.conf?

2014-08-18 7:08 GMT+02:00 Miha <m...@softnet.si>:
> Hi Emmanuel,
>
> this is my config:
>
> Pacemaker Nodes:
>  sip1 sip2
>
> Resources:
>  Master: ms_drbd_mysql
>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>   Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
>    Attributes: drbd_resource=clusterdb_res
>    Operations: monitor interval=29s role=Master (p_drbd_mysql-monitor-29s)
>                monitor interval=31s role=Slave (p_drbd_mysql-monitor-31s)
>  Group: g_mysql
>   Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
>    Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd fstype=ext4
>    Meta Attrs: target-role=Started
>   Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
>    Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
>   Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
>    Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root
>                config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid
>                socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe
>                additional_parameters="--bind-address=212.13.249.55 --user=root"
>    Meta Attrs: target-role=Started
>    Operations: start interval=0 timeout=120s (p_mysql-start-0)
>                stop interval=0 timeout=120s (p_mysql-stop-0)
>                monitor interval=20s timeout=30s (p_mysql-monitor-20s)
>  Clone: cl_ping
>   Meta Attrs: interleave=true
>   Resource: p_ping (class=ocf provider=pacemaker type=ping)
>    Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
>    Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
>                start interval=0s timeout=60s (p_ping-start-0s)
>                stop interval=0s (p_ping-stop-0s)
>  Resource: opensips (class=lsb type=opensips)
>   Meta Attrs: target-role=Started
>   Operations: start interval=0 timeout=120 (opensips-start-0)
>               stop interval=0 timeout=120 (opensips-stop-0)
>
> Stonith Devices:
>  Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
>   Attributes: action=off ipaddr=172.30.0.2 port=8 community=test login=snmp8 passwd=soft1234
>   Meta Attrs: target-role=Started
>  Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
>   Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1 login=snmp8 passwd=soft1234
>   Meta Attrs: target-role=Started
> Fencing Levels:
>
> Location Constraints:
>   Resource: ms_drbd_mysql
>     Constraint: l_drbd_master_on_ping
>       Rule: score=-INFINITY role=Master boolean-op=or (id:l_drbd_master_on_ping-rule)
>         Expression: not_defined ping (id:l_drbd_master_on_ping-expression)
>         Expression: ping lte 0 type=number (id:l_drbd_master_on_ping-expression-0)
> Ordering Constraints:
>   promote ms_drbd_mysql then start g_mysql (INFINITY) (id:o_drbd_before_mysql)
>   g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
> Colocation Constraints:
>   g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master) (id:c_mysql_on_drbd)
>   opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)
>
> Cluster Properties:
>  cluster-infrastructure: cman
>  dc-version: 1.1.10-14.el6-368c726
>  no-quorum-policy: ignore
>  stonith-enabled: true
> Node Attributes:
>  sip1: standby=off
>  sip2: standby=off
>
> br
> miha
>
> On 8/14/2014 3:05 PM, emmanuel segura wrote:
>
>> ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2): Stopped
>> Jul 03 14:10:51 [2701] sip2 crmd: notice: too_many_st_failures: No devices found in cluster to fence sip1, giving up
>>
>> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command: Processed st_query reply from sip2: OK (0)
>> Jul 03 14:10:54 [2697] sip2 stonith-ng: error: remote_op_done: Operation reboot of sip1 by sip2 for stonith_admin.cman.28299@sip2.94474607: No such device
>>
>> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command: Processed st_notify reply from sip2: OK (0)
>> Jul 03 14:10:54 [2701] sip2 crmd: notice: tengine_stonith_notify: Peer sip1 was not terminated (reboot) by sip2 for sip2: No such device (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client stonith_admin.cman.28299
>>
>> Sorry for the short answer: have you tested your cluster fencing? Can you show your cluster.conf xml?
>>
>> 2014-08-14 14:44 GMT+02:00 Miha <m...@softnet.si>:
>>>
>>> emmanuel,
>>>
>>> Thanks. But how do I find out why fencing stopped working?
>>>
>>> br
>>> miha
>>>
>>> On 8/14/2014 2:35 PM, emmanuel segura wrote:
>>>
>>>> Node sip2: UNCLEAN (offline) is unclean because the cluster fencing failed to complete the operation
>>>>
>>>> 2014-08-14 14:13 GMT+02:00 Miha <m...@softnet.si>:
>>>>>
>>>>> hi.
>>>>>
>>>>> another thing.
>>>>>
>>>>> On node 1, pcs is running:
>>>>> [root@sip1 ~]# pcs status
>>>>> Cluster name: sipproxy
>>>>> Last updated: Thu Aug 14 14:13:37 2014
>>>>> Last change: Sat Feb 1 20:10:48 2014 via crm_attribute on sip1
>>>>> Stack: cman
>>>>> Current DC: sip1 - partition with quorum
>>>>> Version: 1.1.10-14.el6-368c726
>>>>> 2 Nodes configured
>>>>> 10 Resources configured
>>>>>
>>>>> Node sip2: UNCLEAN (offline)
>>>>> Online: [ sip1 ]
>>>>>
>>>>> Full list of resources:
>>>>>
>>>>>  Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
>>>>>      Masters: [ sip2 ]
>>>>>      Slaves: [ sip1 ]
>>>>>  Resource Group: g_mysql
>>>>>      p_fs_mysql (ocf::heartbeat:Filesystem): Started sip2
>>>>>      p_ip_mysql (ocf::heartbeat:IPaddr2): Started sip2
>>>>>      p_mysql (ocf::heartbeat:mysql): Started sip2
>>>>>  Clone Set: cl_ping [p_ping]
>>>>>      Started: [ sip1 sip2 ]
>>>>>  opensips (lsb:opensips): Stopped
>>>>>  fence_sip1 (stonith:fence_bladecenter_snmp): Started sip2
>>>>>  fence_sip2 (stonith:fence_bladecenter_snmp): Started sip2
>>>>>
>>>>> [root@sip1 ~]#
>>>>>
>>>>> On 8/14/2014 2:12 PM, Miha wrote:
>>>>>
>>>>>> Hi emmanuel,
>>>>>>
>>>>>> i think so, what is the best way to check?
>>>>>>
>>>>>> Sorry for my noob question; I configured this 6 months ago and
>>>>>> everything was working fine till now. Now I need to find out what
>>>>>> really happened before I do something stupid.
>>>>>>
>>>>>> tnx
>>>>>>
>>>>>> On 8/14/2014 1:58 PM, emmanuel segura wrote:
>>>>>>>
>>>>>>> are you sure your cluster fencing is working?
>>>>>>>
>>>>>>> 2014-08-14 13:40 GMT+02:00 Miha <m...@softnet.si>:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I noticed today that I am having a problem with the cluster. The
>>>>>>>> master server is offline, but the virtual IP is still assigned to
>>>>>>>> it and all services are running properly (for production).
>>>>>>>>
>>>>>>>> When I check, I get the following:
>>>>>>>>
>>>>>>>> [root@sip2 cluster]# pcs status
>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>> [root@sip2 cluster]# /etc/init.d/corosync status
>>>>>>>> corosync dead but pid file exists
>>>>>>>> [root@sip2 cluster]# pcs status
>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>> [root@sip2 cluster]# tailf fenced.log
>>>>>>>> Aug 14 13:34:25 fenced cman_get_cluster error -1 112
>>>>>>>>
>>>>>>>> The main thing is: what to do now? Do "pcs start" and hope for the
>>>>>>>> best, or what?
>>>>>>>>
>>>>>>>> I have pasted the log in pastebin: http://pastebin.com/SUp2GcmN
>>>>>>>>
>>>>>>>> tnx!
>>>>>>>>
>>>>>>>> miha
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>
>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>> Bugs: http://bugs.clusterlabs.org
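For anyone hitting the same "No such device" / "No devices found in cluster to fence sip1" errors as in the logs above, fencing can be verified layer by layer. This is a sketch only, not something that was run against this cluster; the address, community, and plug numbers are the ones from the config earlier in the thread, so adjust to taste:

```shell
# 1. Talk to the BladeCenter management module directly, bypassing the
#    cluster. The "status" action is read-only, so it is safe on a live node:
fence_bladecenter_snmp -a 172.30.0.2 -c test -n 8 -o status

# 2. Ask stonith-ng which devices it believes can fence each node:
stonith_admin --list sip1
stonith_admin --list sip2

# 3. End-to-end test: fence the node that is NOT serving traffic and
#    confirm it actually powers off or reboots:
stonith_admin --reboot sip2

# 4. For the "corosync dead but pid file exists" state on sip2, a clean
#    restart of the stack (cman first, then pacemaker) is the usual first step:
service cman start && service pacemaker start
```

If step 1 fails, the problem is SNMP credentials or network reachability of the management module, not Pacemaker; if step 1 works but step 2 lists nothing, the stonith resource configuration or its location constraints are the place to look.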
--
this is my life and I live it as long as God wills
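On the question that opens this message, the cman /etc/cluster/cluster.conf: on a cman-based Pacemaker cluster, cluster.conf normally does no fencing of its own and instead redirects cman's fencing requests to Pacemaker via the fence_pcmk agent. A minimal sketch of what such a file typically looks like for this two-node cluster follows; the node and cluster names are taken from the pcs status output above, and this is emphatically not the poster's actual file:

```xml
<?xml version="1.0"?>
<cluster config_version="1" name="sipproxy">
  <!-- two-node cman cluster: no quorum disk, so expected_votes=1 -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="sip1" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <!-- redirect cman fencing requests for sip1 to Pacemaker -->
          <device name="pcmk" port="sip1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="sip2" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="sip2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <!-- fence_pcmk hands fencing off to Pacemaker's stonith resources -->
    <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>
</cluster>
```

A cluster.conf that lists no usable fence method for a node would explain cman-level fencing failures even when Pacemaker's own stonith resources look healthy; after any edit, the file can be checked with ccs_config_validate.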