Re: [Pacemaker] trouble with quorum

2013-05-23 Thread Andrey Groshev
24.05.2013, 01:39, "Andrew Beekhof" : > On 24/05/2013, at 3:49 AM, Andrey Groshev wrote: > >> 23.05.2013, 02:51, "Andrew Beekhof" : >>> On 22/05/2013, at 10:25 PM, Groshev Andrey wrote: Hello, I try build cluster with 2 nodes + one quorum node (without pacemaker). >>> This

Re: [Pacemaker] Pacemaker still may include memory leaks

2013-05-23 Thread Vladislav Bogdanov
24.05.2013 06:34, Andrew Beekhof wrote: > Any help figuring out where the leaks might be would be very much appreciated > :) One (and the only) suspect is unfortunately crmd itself. It has private heap grown from 2708 to 3680 kB. All other relevant differences are in qb shm buffers, which are co

Re: [Pacemaker] [Question and Problem] In vSphere5.1 environment, IO blocking of pengine occurs at the time of shared disk trouble for a long time.

2013-05-23 Thread Vladislav Bogdanov
22.05.2013 09:05, Andrew Beekhof wrote: > > On 17/05/2013, at 4:17 PM, Vladislav Bogdanov wrote: > >> P.S. Andrew, is this patch ok to apply? > > https://github.com/beekhof/pacemaker/commit/c7e10c6 :) Awesome. Thanks. ___ Pacemaker mailing list: Pa

Re: [Pacemaker] [Question and Problem] In vSphere5.1 environment, IO blocking of pengine occurs at the time of shared disk trouble for a long time.

2013-05-23 Thread Vladislav Bogdanov
24.05.2013 07:58, renayama19661...@ybb.ne.jp wrote: > Hi Andrew, > Hi Vladislav, > >> We test movement when we located pe file in tmpfs repeatedly. >> It seems to move well for the moment. > > I only adopted tmpfs, and the I/O block of pengine was improved. Great that it helped. > I confirm the

Re: [Pacemaker] [Question and Problem] In vSphere5.1 environment, IO blocking of pengine occurs at the time of shared disk trouble for a long time.

2013-05-23 Thread renayama19661014
Hi Andrew, Hi Vladislav, > We test movement when we located pe file in tmpfs repeatedly. > It seems to move well for the moment. I only adopted tmpfs, and the I/O block of pengine was improved. I confirm the synchronization with the fixed file, but think that there is not the problem from now on

Re: [Pacemaker] S_POLICY_ENGINE state continues being maintained

2013-05-23 Thread Andrew Beekhof
On 24/05/2013, at 2:19 PM, Andrew Beekhof wrote: > > On 23/05/2013, at 4:44 PM, Kazunori INOUE wrote: > >> Hi, >> >> I'm using pacemaker-1.1 (c3486a4a8d. the latest devel). >> After fencing caused by split-brain failed 11 times, S_POLICY_ENGINE state >> is kept even if I recover split-brain

Re: [Pacemaker] pacemaker-remote tls handshaking

2013-05-23 Thread David Vossel
- Original Message - > From: "Lindsay Todd" > To: "The Pacemaker cluster resource manager" > Sent: Thursday, May 23, 2013 4:35:02 PM > Subject: Re: [Pacemaker] pacemaker-remote tls handshaking > > Working on this problem further... > > On Tue, May 21, 2013 at 5:14 PM, David Vossel wrot

Re: [Pacemaker] S_POLICY_ENGINE state continues being maintained

2013-05-23 Thread Andrew Beekhof
On 23/05/2013, at 4:44 PM, Kazunori INOUE wrote: > Hi, > > I'm using pacemaker-1.1 (c3486a4a8d. the latest devel). > After fencing caused by split-brain failed 11 times, S_POLICY_ENGINE state is > kept even if I recover split-brain. Odd, I get: May 24 00:17:08 corosync-host-1 crmd[3056]: n

Re: [Pacemaker] Pacemaker still may include memory leaks

2013-05-23 Thread Andrew Beekhof
Any help figuring out where the leaks might be would be very much appreciated :) Also, the measurements are in pages... could you run "getconf PAGESIZE" and let us know the result? I'm guessing 4096 bytes. On 23/05/2013, at 5:47 PM, Yuichi SEINO wrote: > Hi, > > I retry the test after we upda
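The page-size question above matters because a measurement reported in pages only becomes a byte count once it is multiplied by the system page size. A minimal sketch of that conversion follows, assuming a Unix-like host; the function name `pages_to_kib` and the sample page count are hypothetical and are not taken from the thread.

```python
import os


def pages_to_kib(pages, page_size=None):
    """Convert a page count to KiB.

    page_size is in bytes. When omitted, it is read from the running
    system, which is the same value "getconf PAGESIZE" reports.
    """
    if page_size is None:
        page_size = os.sysconf("SC_PAGESIZE")
    return pages * page_size / 1024


if __name__ == "__main__":
    # Hypothetical sample: 100 pages with the guessed 4096-byte page size.
    print(pages_to_kib(100, 4096))  # 400.0
```

Passing the page size explicitly keeps the result reproducible when the figures were collected on a different machine from the one doing the arithmetic.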

Re: [Pacemaker] unmanaged resource stopped the group

2013-05-23 Thread Andrew Beekhof
On 23/05/2013, at 8:52 PM, Alexandr A. Alexandrov wrote: > Hi, All! > > On one of my clusters I have resources groups, second group depends on first > resource in the first group. Today I needed to restart one service from the > first group (no dependencies other than group), so I made in unm

Re: [Pacemaker] newbie question(s)

2013-05-23 Thread Alex Samad - Yieldbroker
> -Original Message- > From: Florian Crouzat [mailto:gen...@floriancrouzat.net] > Sent: Thursday, 23 May 2013 6:27 PM > To: pacemaker@oss.clusterlabs.org > Subject: Re: [Pacemaker] newbie question(s) > [snip] > > You could also wait for a failover where the VIP or (any resource) will fa

Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

2013-05-23 Thread Andrew Beekhof
On 24/05/2013, at 2:43 AM, Andrew Widdersheim wrote: > After setting the crmd-transition-delay to 2 * my ping monitor interval the > issues I was seeing before in testing have not re-occurred. Even a couple of seconds should be plenty. The dampen value gets them almost arriving at the same ti

Re: [Pacemaker] pacemaker-remote tls handshaking

2013-05-23 Thread Andrew Beekhof
On 24/05/2013, at 7:35 AM, Lindsay Todd wrote: > Working on this problem further... > > On Tue, May 21, 2013 at 5:14 PM, David Vossel wrote: >> I'd suggest this. Try running the pacemaker_remote regression test and see >> what happens. This will start up >> an instance of pacemaker_remote l

Re: [Pacemaker] pacemaker-remote tls handshaking

2013-05-23 Thread Lindsay Todd
Working on this problem further... On Tue, May 21, 2013 at 5:14 PM, David Vossel wrote: > I'd suggest this. Try running the pacemaker_remote regression test and see > what happens. This will start up > an instance of pacemaker_remote locally and issue client commands to it to > test both the

Re: [Pacemaker] trouble with quorum

2013-05-23 Thread Andrew Beekhof
On 24/05/2013, at 3:49 AM, Andrey Groshev wrote: > > > 23.05.2013, 02:51, "Andrew Beekhof" : >> On 22/05/2013, at 10:25 PM, Groshev Andrey wrote: >> >>> Hello, >>> >>> I try build cluster with 2 nodes + one quorum node (without pacemaker). >> >> This is the root of your problem. >> >> Y

Re: [Pacemaker] trouble with quorum

2013-05-23 Thread Andrey Groshev
23.05.2013, 02:51, "Andrew Beekhof" : > On 22/05/2013, at 10:25 PM, Groshev Andrey wrote: > >> Hello, >> >> I try build cluster with 2 nodes + one quorum node (without pacemaker). > > This is the root of your problem. > > Your config has: > >> service { >> name: pacemaker >>

Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

2013-05-23 Thread Andrew Widdersheim
After setting the crmd-transition-delay to 2 * my ping monitor interval the issues I was seeing before in testing have not re-occurred. Thanks again for the help.

Re: [Pacemaker] error: do_exit: Could not recover from internal error

2013-05-23 Thread Andrew Beekhof
On 23/05/2013, at 9:47 PM, Brian J. Murrell wrote: > On 13-05-22 07:05 PM, Andrew Beekhof wrote: >> >> Also, 1.1.8-7 was not tested with the plugin _at_all_ (and neither will >> future RHEL builds). > > Was 1.1.7-* in EL 6.3 tested with the plugin? No. Which is why I didn't notice logging w

Re: [Pacemaker] error: do_exit: Could not recover from internal error

2013-05-23 Thread Brian J. Murrell
On 13-05-22 07:05 PM, Andrew Beekhof wrote: > > Also, 1.1.8-7 was not tested with the plugin _at_all_ (and neither will > future RHEL builds). Was 1.1.7-* in EL 6.3 tested with the plugin? Is staying with most recent EL 6.3 pacemaker-1.1.7 release really the more stable option for people not a

[Pacemaker] unmanaged resource stopped the group

2013-05-23 Thread Alexandr A. Alexandrov
Hi, All! On one of my clusters I have resources groups, second group depends on first resource in the first group. Today I needed to restart one service from the first group (no dependencies other than group), so I made in unmanaged: May 23 14:14:22 kennedy

Re: [Pacemaker] Release candidate: 1.1.10-rc3

2013-05-23 Thread Andrew Beekhof
grrr. because for some reason git needs "git push --tags". fixed, thanks for letting me know On 23/05/2013, at 6:30 PM, Johan Huysmans wrote: > Hi All, > > I've built an rpm as described below, however I can see during the build > that rc2 is used and there is no mention of rc3. > It seem

Re: [Pacemaker] Release candidate: 1.1.10-rc3

2013-05-23 Thread Johan Huysmans
Hi All, I've built an rpm as described below, however I can see during the build that rc2 is used and there is no mention of rc3. It seems that there is no rc3 tag available: $ git tag -l | grep Pacemaker | sort -Vr | grep rc Pacemaker-1.1.10-rc2 Pacemaker-1.1.10-rc1 gr. Johan On 23-05-1
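The tag check in the message above filters the tag list and sorts it newest-first by version. A rough Python sketch of the same idea follows. The helper names `version_key` and `release_candidates` are hypothetical, and the natural-order key only approximates what `sort -V` does; it is not a reimplementation of GNU sort.

```python
import re


def version_key(tag):
    """Split a tag into digit and non-digit runs so numbers compare numerically.

    Each run becomes a (kind, value) tuple; the kind keeps integers and
    strings from ever being compared with each other directly.
    """
    return [(1, int(part)) if part.isdigit() else (0, part)
            for part in re.findall(r"\d+|\D+", tag)]


def release_candidates(tags):
    """Return Pacemaker release-candidate tags, newest first."""
    rc_tags = [t for t in tags if "Pacemaker" in t and "rc" in t]
    return sorted(rc_tags, key=version_key, reverse=True)


if __name__ == "__main__":
    # The two rc tags come from the message; the third is a made-up non-rc tag.
    tags = ["Pacemaker-1.1.10-rc1", "Pacemaker-1.1.9", "Pacemaker-1.1.10-rc2"]
    for tag in release_candidates(tags):
        print(tag)
```

If an expected tag such as rc3 is missing from the output, the tag was never published to the repository being queried, which matches the "git push --tags" explanation given in the reply.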

Re: [Pacemaker] newbie question(s)

2013-05-23 Thread Florian Crouzat
On 22/05/2013 02:13, Alex Samad - Yieldbroker wrote: > >Any help or suggestions muchly appreciated > > > >Also, fencing! Not sure that I need it. The app is always running on both nodes, it's just the ip address that is shared A cluster without fencing is not a cluster, by definition. Pr

Re: [Pacemaker] Pacemaker still may include memory leaks

2013-05-23 Thread Yuichi SEINO
Hi, I retry the test after we updated packages to the latest tag and OS. glue and booth is latest. * Environment OS:RHEL 6.4 cluster-glue:latest(commit:2755:8347e8c9b94f) + patch[detail:http://www.gossamer-threads.com/lists/linuxha/dev/85787] resource-agent:v3.9.5 libqb:v0.14.4 corosync:v2.3.0 pa

Re: [Pacemaker] S_POLICY_ENGINE state continues being maintained

2013-05-23 Thread Nikita Staroverov
23.05.2013 10:58, Andrew Beekhof wrote: On 23/05/2013, at 4:44 PM, Kazunori INOUE wrote: Hi, I'm using pacemaker-1.1 (c3486a4a8d. the latest devel). After fencing caused by split-brain failed 11 times, S_POLICY_ENGINE state is kept even if I recover split-brain. Well that's annoying, I'll ha

Re: [Pacemaker] S_POLICY_ENGINE state continues being maintained

2013-05-23 Thread Andrew Beekhof
On 23/05/2013, at 4:44 PM, Kazunori INOUE wrote: > Hi, > > I'm using pacemaker-1.1 (c3486a4a8d. the latest devel). > After fencing caused by split-brain failed 11 times, S_POLICY_ENGINE state is > kept even if I recover split-brain. Well that's annoying, I'll have a look in the morning. > >