Re: [Pacemaker] Could not initialize corosync configuration API error 2

2013-10-30 Thread Andrew Beekhof
Jan: not sure if you're on the pacemaker list On 29 Oct 2013, at 6:43 pm, Bauer, Stefan (IZLBW Extern) wrote: > Dear Developers/Users, > > we’re using Pacemaker 1.1.7 and Corosync Cluster Engine 1.4.2 with Debian 6 > and a recent vanilla Kernel (3.10). > > On quite a lot of our clusters we

Re: [Pacemaker] Corosync hanging during stop

2013-10-30 Thread Andrew Beekhof
On 17 Oct 2013, at 8:05 pm, D.Gossrau wrote: > Hi Lars, > > On 10/12/2013 02:14 AM, Lars Ellenberg wrote: >> On Thu, Oct 10, 2013 at 04:06:46PM +0200, Detlef Gossrau wrote: >>> Hi, >>> >>> I created a cluster installation with two nodes. Everything is >>> running smoothly most of the time. But

Re: [Pacemaker] pacemaker shutdown under high load

2013-10-30 Thread Andrew Beekhof
On 17 Oct 2013, at 1:37 am, Alessandro Bono wrote: > On 16/10/2013 00:11, Andrew Beekhof wrote: >> On 09/10/2013, at 10:53 PM, Alessandro Bono >> wrote: >> >> >>> Hi >>> >>> >>> this week end my pacemaker shutdown on primary node during machine backup >>> attached compressed log of primary

Re: [Pacemaker] The larger cluster is tested.

2013-10-30 Thread Andrew Beekhof
On 29 Oct 2013, at 12:12 am, yusuke iida wrote: > Hi, Andrew > > I tested using following commit. > https://github.com/beekhof/pacemaker/commit/b6fa1e650f64b1ba73fdb143f41323aa8cb3544e > > However, timeout of operation has still occurred. > > I analyzed the log. > > I am noting that it is la

Re: [Pacemaker] An internal error occurred in crmd

2013-10-30 Thread Andrew Beekhof
I think this should be fixed by: https://github.com/beekhof/pacemaker/commit/ea7991f The underlying issue though, is that the lrmd command timed out, which _should_ have been fixed by: https://github.com/beekhof/pacemaker/commit/d65b270 What are you doing to this poor cluster? :) On 21 Oc

Re: [Pacemaker] Stonith issue with fence_virsh

2013-10-30 Thread Andrew Beekhof
Personally I use fence_xvm. IIRC, it's the supported equivalent of fence_virsh. On 24 Oct 2013, at 6:38 pm, Beo Banks wrote: > hi, > > i have enable the debug option and i use the ip instead of hostname > > primitive stonith-zarafa02 stonith:fence_virsh \ > params pcmk_host_list="zaraf

Re: [Pacemaker] resources does not start on survied node after reboot

2013-10-30 Thread Andrew Beekhof
On 30 Oct 2013, at 1:12 am, Саша Александров wrote: > Hi! > > I have a 2-node cluster with shared storage and SBD-fencing. > One node was down for maintenance. > Due to external reasons, second node was rebooted. After reboot service never > got up: > > Oct 29 13:04:21 wcs2 pengine[2362]: wa

Re: [Pacemaker] An internal error occurred in crmd

2013-10-30 Thread Kazunori INOUE
Hi Andrew, 2013/10/31 Andrew Beekhof : > I think this should be fixed by: >https://github.com/beekhof/pacemaker/commit/ea7991f I confirmed that it was fixed. Many thanks, > > The underlying issue though, is that the lrmd command timed out, which > _should_ have been fixed by: >https://g