Re: [Pacemaker] Node fails to rejoin cluster

2013-02-14 Thread Proskurin Kirill
On 02/08/2013 04:59 AM, Andrew Beekhof wrote: Suggests it's a bug that got fixed recently. Keep an eye out for 1.1.9 in the next week or so (or you could try building from source if you're in a hurry). Will 1.1.9 be CentOS 5.x friendly? -- Best regards, Proskurin K

Re: [Pacemaker] Periodically appear non-existent nodes

2012-04-17 Thread Proskurin Kirill
NODENAME crm_node --force --remove NODENAME cibadmin --delete --obj_type nodes --crm_xml '' cibadmin --delete --obj_type status --crm_xml 'uname="NODENAME"/>' -- Best regards, Proskurin Kirill ___ Pacemaker mailing list:
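
A minimal sketch of the removal sequence quoted above, written as a Python helper that only builds the command lines and does not run anything. The XML arguments did not survive in the archived snippet, so the `<node uname=.../>` and `<node_state uname=.../>` bodies are an assumption based on the form used in the Pacemaker documentation; `node_removal_commands` is a hypothetical name used for illustration.

```python
import shlex
from xml.sax.saxutils import quoteattr


def node_removal_commands(node_name):
    """Return the shell command lines for purging a stale node, in order."""
    # Assumption: these XML bodies follow the Pacemaker documentation;
    # the archived message lost its tags.
    node_xml = f"<node uname={quoteattr(node_name)}/>"
    state_xml = f"<node_state uname={quoteattr(node_name)}/>"
    argvs = [
        ["crm_node", "--force", "--remove", node_name],
        ["cibadmin", "--delete", "--obj_type", "nodes", "--crm_xml", node_xml],
        ["cibadmin", "--delete", "--obj_type", "status", "--crm_xml", state_xml],
    ]
    # shlex.join quotes each argument so the line can be pasted into a shell.
    return [shlex.join(argv) for argv in argvs]


if __name__ == "__main__":
    for command in node_removal_commands("NODENAME"):
        print(command)
```

Printing the commands rather than executing them leaves room to review them before touching a live CIB.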

[Pacemaker] cib not connected

2011-10-24 Thread Proskurin Kirill
eat/crm/*, start up all nodes. But it's not really an option. :-) -- Best regards, Proskurin Kirill ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://w

Re: [Pacemaker] Questions about reasonable cluster size...

2011-10-20 Thread Proskurin Kirill
not, what is the current recommendation for maximum number of nodes and resources? I start to have problems with 10+ nodes. It depends heavily on the corosync configuration, afaik. You should test it. -- Best regards, Proskurin Kirill

Re: [Pacemaker] 1) attrd, crmd, cib, stonithd going to 100% CPU after standby 2) monitoring bug 3) meta failure-timeout issue

2011-10-17 Thread Proskurin Kirill
stonith. Each time I do: crm resource stop test-kill-15.pl And in cases 1 and 2, I get "unmanaged" on this resource. Because you've not configured any fencing devices. -- Best regards, Proskurin Kirill

Re: [Pacemaker] 1) attrd, crmd, cib, stonithd going to 100% CPU after standby 2) monitoring bug 3) meta failure-timeout issue

2011-10-05 Thread Proskurin Kirill
On 10/05/2011 04:19 AM, Andrew Beekhof wrote: On Mon, Oct 3, 2011 at 5:50 PM, Proskurin Kirill wrote: On 10/03/2011 05:32 AM, Andrew Beekhof wrote: corosync-1.4.1 pacemaker-1.1.5 pacemaker runs with "ver: 1" 2) This one is scary. I have twice run into a situation where pacemaker t

Re: [Pacemaker] 1) attrd, crmd, cib, stonithd going to 100% CPU after standby 2) monitoring bug 3) meta failure-timeout issue

2011-10-02 Thread Proskurin Kirill
set the target-role to Stopped. No, I want to use failure-timeout, but without wiping out errors when a resource has already been stopped by Pacemaker because of errors rather than by the admin. -- Best regards, Proskurin Kirill

[Pacemaker] Ignoring expired failure

2011-09-30 Thread Proskurin Kirill
0:4c16dc39-1fd3-41f2-b582-0236f6b6eccc) on mysender34.mail.ru So Pacemaker knows that the resource may be down but ignores it. Why? -- Best regards, Proskurin Kirill

[Pacemaker] 1) attrd, crmd, cib, stonithd going to 100% CPU after standby 2) monitoring bug 3) meta failure-timeout issue

2011-09-29 Thread Proskurin Kirill
e don't know if it was stopped by the admin or because of errors. I think failure-timeout should not apply to a stopped resource. Any chance to avoid this? -- Best regards, Proskurin Kirill #!/bin/sh ### #

Re: [Pacemaker] Cluster type is: corosync

2011-08-01 Thread Proskurin Kirill
02.08.2011 1:00, Andrew Beekhof wrote: On Mon, Aug 1, 2011 at 10:23 PM, Proskurin Kirill wrote: 01.08.2011 5:42, Andrew Beekhof wrote: "Finally, tell Corosync to load the Pacemaker plugin." As I said before: "And I run pacemakerd after corosync start." The

Re: [Pacemaker] Cluster type is: corosync

2011-08-01 Thread Proskurin Kirill
01.08.2011 5:42, Andrew Beekhof wrote: "Finally, tell Corosync to load the Pacemaker plugin." As I said before: "And I run pacemakerd after corosync start." Anyway, the problem is solved for me. -- Best regards, Proskurin Kirill

Re: [Pacemaker] Upgrading from 1.0 to 1.1

2011-07-27 Thread Proskurin Kirill
27.07.2011 5:56, Andrew Beekhof wrote: On Tue, Jul 19, 2011 at 5:40 PM, Proskurin Kirill wrote: On 07/19/2011 03:22 AM, Andrew Beekhof wrote: On Fri, Jul 15, 2011 at 10:33 PM, Proskurin Kirill wrote: Hello all. I found that I am using corosync with pacemaker "ver:0" with

Re: [Pacemaker] Cluster type is: corosync

2011-07-27 Thread Proskurin Kirill
ais" \ expected-quorum-votes="6" Offline nodes(Cluster type is: corosync) [root@mysender2 ~]# crm configure show [root@mysender2 ~]# pacemaker-1.1.5 corosync-1.4.0 cluster-glue-1.0.6 openais-1.1.2 All nodes have same rpms. On Fri, Jul 22, 2011 at 7:47 PM, Proskurin K

Re: [Pacemaker] Cluster type is: corosync

2011-07-26 Thread Proskurin Kirill
On 07/26/2011 11:00 AM, Andrew Beekhof wrote: On Mon, Jul 25, 2011 at 7:18 PM, Proskurin Kirill wrote: 25.07.2011 10:10, Andrew Beekhof wrote: Which packages are you using? I built them from your official source repository. Ok. And did you add the pacemaker configuration options to

Re: [Pacemaker] Cluster type is: corosync

2011-07-25 Thread Proskurin Kirill
Hello. I updated openais to the latest 1.1.4, but this does not help at all. Google knows nothing about it. I have run out of ideas. 25.07.2011 13:18, Proskurin Kirill wrote: 25.07.2011 10:10, Andrew Beekhof wrote: Which packages are you using? I built them from your official source repository. pacemaker

Re: [Pacemaker] Cluster type is: corosync

2011-07-25 Thread Proskurin Kirill
25.07.2011 10:10, Andrew Beekhof wrote: Which packages are you using? I built them from your official source repository. pacemaker-1.1.5 corosync-1.4.0 cluster-glue-1.0.6 openais-1.1.2 All nodes have the same rpms. On Fri, Jul 22, 2011 at 7:47 PM, Proskurin Kirill wrote: Hello again! Hope

Re: [Pacemaker] Sending message via cpg FAILED: (rc=12) Doesn't exist

2011-07-22 Thread Proskurin Kirill
22.07.2011 20:30, Steven Dake wrote: On 07/22/2011 01:15 AM, Proskurin Kirill wrote: Hello all. pacemaker-1.1.5 corosync-1.4.0 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee Is there a problem

[Pacemaker] Cluster type is: corosync

2011-07-22 Thread Proskurin Kirill
e: Cluster type is: 'corosync'. Jul 22 13:39:17 mysender2.mail.ru cib: [9029]: info: get_cluster_type: Cluster type is: 'corosync'. Jul 22 13:39:18 mysender2.mail.ru crmd: [9033]: info: get_cluster_type: Cluster type is: 'corosync'. What's wrong and how can I f

[Pacemaker] Sending message via cpg FAILED: (rc=12) Doesn't exist

2011-07-22 Thread Proskurin Kirill
0 state=unknown addr=(null) votes=0 born=0 seen=0 proc=000 2 (new) Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee Is there a problem? -- Best regards, P
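
To judge how often the same entries keep reappearing in "Retransmit List" lines like the ones above, a small tally over the log can help. This is only a sketch: `retransmit_counts` is a hypothetical helper, the regular expression is fitted to the two log lines quoted here, and reading the entries as hexadecimal sequence numbers is an assumption.

```python
import re
from collections import Counter

# Fitted to the TOTEM lines quoted in this message; other corosync
# versions may format them differently.
RETRANSMIT_RE = re.compile(r"corosync \[TOTEM\s*\] Retransmit List: ([0-9a-f ]+)")


def retransmit_counts(log_lines):
    """Count how often each entry appears across Retransmit List lines."""
    counts = Counter()
    for line in log_lines:
        match = RETRANSMIT_RE.search(line)
        if match is None:
            continue
        for token in match.group(1).split():
            counts[int(token, 16)] += 1  # assumption: entries are hexadecimal
    return counts


if __name__ == "__main__":
    sample = [
        "Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee",
        "Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee",
    ]
    for seq, seen in sorted(retransmit_counts(sample).items()):
        print(f"{seq:x}: listed {seen} time(s)")
```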

Re: [Pacemaker] Upgrading from 1.0 to 1.1

2011-07-19 Thread Proskurin Kirill
On 07/19/2011 03:22 AM, Andrew Beekhof wrote: On Fri, Jul 15, 2011 at 10:33 PM, Proskurin Kirill wrote: Hello all. I found that I am using corosync with pacemaker "ver:0" with pacemaker 1.1.5 installed, i.e. without starting pacemakerd. Sounds wrong. :-) So I tried to upgrade. I shutdow

[Pacemaker] Upgrading from 1.0 to 1.1

2011-07-15 Thread Proskurin Kirill
this node stays online and on the cluster's DC I see: cib: [18392]: WARN: cib_peer_callback: Discarding cib_sync_one message (255) from mysender10.example.com: not in our membership Is there a way to upgrade all nodes one by one without shutting down the whole cluster? -- Best regards, Prosku

Re: [Pacemaker] Timeout, interval & onfail questions

2011-07-11 Thread Proskurin Kirill
I will use some cron job for this. Fix it so that it doesn't fail; if it fails due to a too short timeout, make the timeout longer. Sad thing: this host has a huge load average from time to time and we can't fix that in the near future. A longer timeout does not really help here (3m by now)... well, I don't really t

[Pacemaker] Timeout, interval & onfail questions

2011-07-09 Thread Proskurin Kirill
e fails, pacemaker will try to run the "stop" action, but because of the high load average it will time out too, and pacemaker decides that the resource is "unmanaged". How can I tune this behaviour? I want pacemaker not to give up and to try again. --

Re: [Pacemaker] SNMP monitoring

2011-07-06 Thread Proskurin Kirill
d traps directly to it. P.S. This works for me on CentOS 5.x with pacemaker 1.1.5 and snmp-5.3.2. -- Best regards, Proskurin Kirill

[Pacemaker] SNMP monitoring

2011-07-04 Thread Proskurin Kirill
into the pacemaker docs (the SNMP chapter is empty there)? -- Best regards, Proskurin Kirill

Re: [Pacemaker] Not connected to AIS

2011-06-28 Thread Proskurin Kirill
On 06/27/2011 09:15 AM, Andrew Beekhof wrote: On Fri, Jun 24, 2011 at 6:56 PM, Proskurin Kirill wrote: Hello. I have a strange problem. One node in the cluster does not work right. In the logs: Jun 23 20:25:25 mysender39.example.com lrmd: [10371]: WARN: For LSB init script, no additional parameters

[Pacemaker] Resource monitor stop working

2011-06-24 Thread Proskurin Kirill
m inf: ClusterIP qm_manager.init property $id="cib-bootstrap-options" \ dc-version="1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87" \ cluster-infrastructure="openais" \ expected-quorum-votes="4" \ stonith-enabled="false"

[Pacemaker] Not connected to AIS

2011-06-24 Thread Proskurin Kirill
5 min but it is still "Waiting for corosync services to unload". So I killed it with -9 and restarted. And everything started normally again. What was wrong? Corosync-1.2.7 Pacemaker-1.0.11 -- Best regards, Proskurin Kirill

Re: [Pacemaker] Deleted nodes returns

2011-06-22 Thread Proskurin Kirill
On 06/22/2011 03:41 PM, Florian Haas wrote: On 2011-06-22 12:41, Proskurin Kirill wrote: Hello all. I have a strange problem. At the beginning, my cluster had nodes called mysender38.i and mysender39.i. Then I: Stop them. Delete all from /var/lib/heartbeat/crm/*. crm_node --force

[Pacemaker] Deleted nodes returns

2011-06-22 Thread Proskurin Kirill
on again to make them disappear. Is it a bug or am I doing something wrong? pacemaker-1.0.11 corosync-1.2.7 -- Best regards, Proskurin Kirill

[Pacemaker] Hostname issues

2011-06-21 Thread Proskurin Kirill
for node name and get the external name. How can I avoid this? I can't change the hostname to the internal one and can't run corosync on the external network. -- Best regards, Proskurin Kirill

[Pacemaker] Groups

2011-06-20 Thread Proskurin Kirill
er way that I missed? -- Best regards, Proskurin Kirill

Re: [Pacemaker] FS mount error

2010-07-22 Thread Proskurin Kirill
tion 647 (Complete=4, Pending=0, Fired=0, Skipped=2, Incomplete=0, Source=/var/lib/pengine/pe-input-691.bz2): Stopped Jul 22 09:33:32 node01 crmd: [1814]: info: te_graph_trigger: Transition 647 is now complete -- Best regards, Proskurin Kirill

[Pacemaker] FS mount error

2010-07-22 Thread Proskurin Kirill
stonithd Jul 22 08:18:46 node01 crmd: [1814]: notice: Not currently connected. Jul 22 08:18:46 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in failed: triggered a retry Jul 22 08:18:46 node01 crmd: [1814]: info: te_connect_stonith: Attempting connecti

Re: [Pacemaker] Pacemaker see double node`s

2010-07-14 Thread Proskurin Kirill
tacks Thanks, it works like a charm. -- Best regards, Proskurin Kirill

[Pacemaker] Pacemaker see double node`s

2010-07-14 Thread Proskurin Kirill
hey hostname as a public ("white") IP, but in /etc/hosts I have 192.168.1.1 node01.domain.com node01 192.168.1.2 node02.domain.com node02. That was done for another test. Is this a problem, and how can I fix it? -- Best regards, Proskurin Kirill
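
When a node name can map to more than one address, it can help to see exactly what /etc/hosts says for each name. The following is a minimal sketch, assuming the usual "address name [alias...]" layout with `#` comments; `parse_hosts` is a hypothetical helper, and keeping the first entry seen for a name is a simplification rather than a statement about how any particular resolver behaves.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its address."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Simplification: keep the first address listed for a name.
            mapping.setdefault(name, address)
    return mapping


if __name__ == "__main__":
    sample = (
        "192.168.1.1 node01.domain.com node01\n"
        "192.168.1.2 node02.domain.com node02\n"
    )
    for name, address in sorted(parse_hosts(sample).items()):
        print(f"{name} -> {address}")
```

On a real node the same function can be fed the contents of /etc/hosts and the result compared with the address corosync is bound to.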