Re: [Linux-HA] Pacemaker 1.1.9 cannot manage more than 127 resources

2013-08-29 Thread Tom Parker
Thanks for your help. I think I have it solved. The trick is that the crm tools also need to know what the Pacemaker IPC buffer size is. I have set: /etc/sysconfig/pacemaker #export LRMD_MAX_CHILDREN="8" # Force use of a particular class of IPC connection # PCMK_ipc_type=shared-mem|socket|pos
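As a sketch, the /etc/sysconfig/pacemaker settings discussed in this thread might look like the following. The LRMD_MAX_CHILDREN and PCMK_ipc_buffer values are the ones quoted in the thread; the comments and the PCMK_ipc_type line are reconstructed from the truncated preview, so treat them as illustrative, not authoritative:

```shell
# /etc/sysconfig/pacemaker -- sketch based on the values quoted in this thread

# Limit concurrent resource operations per node (value quoted by Tom)
export LRMD_MAX_CHILDREN="8"

# Raise the IPC buffer so a large CIB fits; 3172882 is the value quoted
# later in the thread.  Note Tom's finding: the crm client tools read this
# variable too, so it must be exported in their environment as well, not
# only for the cluster daemons.
export PCMK_ipc_buffer=3172882

# Force use of a particular class of IPC connection (left commented out;
# the preview truncates the list of allowed values)
# PCMK_ipc_type=shared-mem|socket|...
```

The key insight from Tom's message is that the buffer size is negotiated per IPC connection, so a daemon-only setting is not enough; the client side must agree.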

Re: [Linux-HA] Pacemaker 1.1.9 cannot manage more than 127 resources

2013-08-29 Thread Andrew Beekhof
You'd have to ask SUSE. They'd know what the old and new versions are, and therefore the differences between the two. On 30/08/2013, at 2:21 PM, Tom Parker wrote: > Do you know if this has changed significantly from the older versions? > This cluster was working fine before the upgrade. > > On Fri 30

Re: [Linux-HA] Pacemaker 1.1.9 cannot manage more than 127 resources

2013-08-29 Thread Tom Parker
Do you know if this has changed significantly from the older versions? This cluster was working fine before the upgrade. On Fri 30 Aug 2013 12:16:35 AM EDT, Andrew Beekhof wrote: > > On 30/08/2013, at 1:42 PM, Tom Parker wrote: > >> My pacemaker config contains the following settings: >> >> LRM

Re: [Linux-HA] Pacemaker 1.1.9 cannot manage more than 127 resources

2013-08-29 Thread Andrew Beekhof
On 30/08/2013, at 1:42 PM, Tom Parker wrote: > My pacemaker config contains the following settings: > > LRMD_MAX_CHILDREN="8" > export PCMK_ipc_buffer=3172882 perhaps go higher > > This is what I had today to get to 127 Resources defined. I am not sure what > I should choose for the PCMK_i

Re: [Linux-HA] Pacemaker 1.1.9 cannot manage more than 127 resources

2013-08-29 Thread Tom Parker
My pacemaker config contains the following settings: LRMD_MAX_CHILDREN="8" export PCMK_ipc_buffer=3172882 This is what I had today to get to 127 Resources defined. I am not sure what I should choose for the PCMK_ipc_type. Do you have any suggestions for large clusters? Thanks Tom On 08/29/20

Re: [Linux-HA] error: te_connect_stonith: Sign-in failed: triggered a retry

2013-08-29 Thread Tom Parker
This is happening when I am using the really large CIB and no. There doesn't seem to be anything else. 3 of my 6 nodes were showing this error. Now that I have deleted and recreated my CIB this log message seems to have gone away. On 08/29/2013 10:16 PM, Andrew Beekhof wrote: > On 30/08/2013,

Re: [Linux-HA] Pacemaker 1.1.9 cannot manage more than 127 resources

2013-08-29 Thread Andrew Beekhof
On 30/08/2013, at 5:49 AM, Tom Parker wrote: > Hello. Last night I updated my SLES 11 servers to HAE-SP3 which contains > the following versions of software: > > cluster-glue-1.0.11-0.15.28 > libcorosync4-1.4.5-0.18.15 > corosync-1.4.5-0.18.15 > pacemaker-mgmt-2.1.2-0.7.40 > pacemaker-mgmt-clie

Re: [Linux-HA] error: te_connect_stonith: Sign-in failed: triggered a retry

2013-08-29 Thread Andrew Beekhof
On 30/08/2013, at 5:51 AM, Tom Parker wrote: > Hello > > Since my upgrade last night I am also seeing this message in the logs on > my servers. > > error: te_connect_stonith: Sign-in failed: triggered a retry > > Old mailing lists seem to imply that this is an issue with heartbeat > which I d

[Linux-HA] error: te_connect_stonith: Sign-in failed: triggered a retry

2013-08-29 Thread Tom Parker
Hello Since my upgrade last night I am also seeing this message in the logs on my servers. error: te_connect_stonith: Sign-in failed: triggered a retry Old mailing lists seem to imply that this is an issue with heartbeat which I don't think I am running. My software stack is this at the moment:

[Linux-HA] Antw: Re: Rare issue with exportfs RA

2013-08-29 Thread Ulrich Windl
Hi! exportfs monitor timeouts can be a network or name(server) lookup issue. Also make sure you export using fully-qualified host names (FQHNs). Regards, Ulrich >>> Dejan Muhamedagic wrote on 29.08.2013 at 15:29 in message <20130829132922.GA4442@walrus.homenet>: > Hi, > > On Wed, Aug 07, 2013 at 05:27:14PM +0200, Cas
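Ulrich's advice about fully-qualified host names can be illustrated with a minimal ocf:heartbeat:exportfs primitive in crm shell syntax. This is a sketch only: the resource name, directory, fsid, mount options and timeouts are assumptions for illustration, not values from the thread.

```shell
# Sketch: exportfs resource using an FQHN clientspec rather than a short
# hostname or wildcard, so monitor checks do not stall on name lookups.
crm configure primitive p_export_data ocf:heartbeat:exportfs \
    params clientspec="client1.example.com" \
           directory="/srv/export/data" \
           fsid="1" \
           options="rw,sync,no_root_squash" \
    op monitor interval="30s" timeout="60s"
```

If DNS is flaky, a matching entry in /etc/hosts on the NFS server keeps the exportfs monitor from depending on an external name server at all.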

Re: [Linux-HA] Rare issue with exportfs RA

2013-08-29 Thread Dejan Muhamedagic
Hi, On Wed, Aug 07, 2013 at 05:27:14PM +0200, Caspar Smit wrote: > Hi all, > > Every couple of months i'm being hit by a very annoying exportfs issue: > > The symptoms are almost identical (except a few log messages) to this > previous thread: > > http://lists.linux-ha.org/pipermail/linux-ha/20

Re: [Linux-HA] Antw: A couple of questions regarding STONITH & fencing ...

2013-08-29 Thread Alex Sudakar
On Thu, Aug 29, 2013 at 4:18 PM, Ulrich Windl wrote: > After some short thinking I find that using ssh as STONITH is probably the > wrong thing to do, because it can never STONITH if the target is down already. > > Maybe some shared storage and a mechanism like sbd is the way to go. > With everyt
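The shared-storage approach Ulrich alludes to can be sketched as follows for the SLES 11 HAE stack under discussion. The device path, sysconfig contents and resource name are placeholders chosen for illustration; nothing here comes from the thread itself.

```shell
# Sketch: sbd-based STONITH on a small shared LUN (placeholder path).

# Initialise the sbd slot table on the shared device (run once, from one node)
sbd -d /dev/disk/by-id/shared-lun create

# /etc/sysconfig/sbd on every node would then point at the same device:
# SBD_DEVICE="/dev/disk/by-id/shared-lun"

# A single external/sbd STONITH resource in the CIB
crm configure primitive stonith-sbd stonith:external/sbd \
    params sbd_device="/dev/disk/by-id/shared-lun"
```

Unlike ssh-based STONITH, sbd fencing does not require the target node to be reachable and cooperating: a node that loses the disk or stops feeding its watchdog self-fences, which is exactly the case ssh STONITH cannot cover.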