Re: [Pacemaker] [PATCH] crm_mon expansion patch

2010-03-24 Thread Yuusuke IIDA
Hi Andrew, Thank you for a reply. (2010/03/24 16:58), Andrew Beekhof wrote: Although I'll probably change it slightly to differentiate between partial and total failure: + if(crm_parse_int(attrvalue, "0")<= 0) { + print_as("\t: Link has failed (Expected=%d)", e

Re: [Pacemaker] Centos 5 and GFS

2010-03-24 Thread Cristian Mammoli - Apra Sistemi
Cristian Mammoli - Apra Sistemi wrote: Thank you Andrew, I'll try to backport it. Checking kernel: Current kernel version: 2.6.18 Minimum kernel version: 2.6.31 FAILED! I guess I'll have to wait for RHEL6 ;-( Better chances with OCFS2? -- Cristian Mammoli APRA SISTEMI srl Via Brodolini,6

Re: [Pacemaker] DRBD 2 node cluster and STONITH configuration help required.

2010-03-24 Thread Lars Ellenberg
On Wed, Mar 24, 2010 at 07:59:26PM +, Mario Giammarco wrote: > Andrew Beekhof writes: > > > > > Have you seen: > >http://www.clusterlabs.org/doc/crm_fencing.html > > I have been led to believe that STONITH > > > will help prevent split brain situations, but the LINBIT instructions do >

Re: [Pacemaker] Centos 5 and GFS

2010-03-24 Thread Cristian Mammoli - Apra Sistemi
Andrew Beekhof wrote: The piece that's probably missing is (dlm|gfs)_controld.pcmk from cluster 3.0.7 (or later). No idea what kernel or other GFS requirements that sucks in. Thank you Andrew, I'll try to backport it. Cheers -- Cristian Mammoli APRA SISTEMI srl Via Brodolini,6 Jesi (AN) tel di

Re: [Pacemaker] WARN: Rexmit of seq ..........

2010-03-24 Thread Lars Ellenberg
On Wed, Mar 24, 2010 at 11:40:54AM +, Joseph, Lester wrote: > Yes they can. That's the confusing bit > > As I mentioned in previous email, we rebooted the switch and the nodes > would have lost connectivity briefly. But the switch is back online > now and the nodes have connectivity as the

Re: [Pacemaker] Collocation Resources

2010-03-24 Thread Travis Dolan
I believe I have found the appropriate place where I need to go. Looks like the bug is assigned to you, let me know if I am incorrect. Cheers. On Tue, Mar 23, 2010 at 4:43 PM, Travis Dolan wrote: > I would like to know if it is possible to configure more than two resources > within a collocation grou

[Pacemaker] configuring the monitor interval

2010-03-24 Thread Alan Jones
Friends, The ocf:pacemaker:Dummy example resource agent script specifies a default monitoring interval (10) which I assume is 10 seconds. This seems like the appropriate place to specify this interval, ie. the resource implementation knows how heavy weight the monitor is and what is a good comprom

Re: [Pacemaker] Collocation Resources

2010-03-24 Thread Travis Dolan
Hello Andrew, I have created the report. I do not know how to access your bugzilla. Could you please let me know where/what to do. Thanks in advance. On Wed, Mar 24, 2010 at 12:49 AM, Andrew Beekhof wrote: > On Wed, Mar 24, 2010 at 12:43 AM, Travis Dolan wrote: > > I would like to know if it

Re: [Pacemaker] DRBD 2 node cluster and STONITH configuration help required.

2010-03-24 Thread Matthew Palmer
On Wed, Mar 24, 2010 at 07:59:26PM +, Mario Giammarco wrote: > Andrew Beekhof writes: > > Have you seen: > >http://www.clusterlabs.org/doc/crm_fencing.html > > I have been led to believe that STONITH > > > will help prevent split brain situations, but the LINBIT instructions do > > > not

Re: [Pacemaker] DRBD 2 node cluster and STONITH configuration help required.

2010-03-24 Thread Andrew Beekhof
On Wed, Mar 24, 2010 at 8:59 PM, Mario Giammarco wrote: > Andrew Beekhof writes: > >> >> Have you seen: >>    http://www.clusterlabs.org/doc/crm_fencing.html >> I have been led to believe that STONITH >> > will help prevent split brain situations, but the LINBIT instructions do >> > not >> > pro

Re: [Pacemaker] Centos 5 and GFS

2010-03-24 Thread Andrew Beekhof
On Wed, Mar 24, 2010 at 7:19 PM, Cristian Mammoli - Apra Sistemi wrote: > What do I need to run a GFS filesystem on top of a dual primary drbd on > Centos 5.4? > The GFS packages from the distro are OK? The piece that's probably missing is (dlm|gfs)_controld.pcmk from cluster 3.0.7 (or later). No

Re: [Pacemaker] DRBD 2 node cluster and STONITH configuration help required.

2010-03-24 Thread Mario Giammarco
Andrew Beekhof writes: > > Have you seen: >http://www.clusterlabs.org/doc/crm_fencing.html > I have been led to believe that STONITH > > will help prevent split brain situations, but the LINBIT instructions do not > > provide any guidance on how to configure STONITH in the pacemaker cluster.

Re: [Pacemaker] Packaging of corosync 1.2.1

2010-03-24 Thread Andrew Beekhof
On Wed, Mar 24, 2010 at 2:23 PM, Andreas Mock wrote: > Hi Andrew, > > you said that you're interested in the results of CTS running against > pacemaker 1.0.8/corosync on real servers with openSuSE 11.2: > > I get quickly problems with the meanwhile identified (thanks to S. Drake) > bug in shutting

[Pacemaker] Centos 5 and GFS

2010-03-24 Thread Cristian Mammoli - Apra Sistemi
What do I need to run a GFS filesystem on top of a dual primary drbd on Centos 5.4? The GFS packages from the distro are OK? I have pacemaker installed from the official repos of clusterlabs: pacemaker-libs-1.0.8-1.el5 pacemaker-1.0.8-1.el5 corosynclib-1.2.0-1.el5 corosync-1.2.0-1.el5 openais

Re: [Pacemaker] [PATCH] Show utilization/capacity information

2010-03-24 Thread Andrew Beekhof
Pushed. Good work! On Wed, Mar 24, 2010 at 1:48 PM, Yan Gao wrote: > On 03/24/10 18:52, Andrew Beekhof wrote: >> On Wed, Mar 24, 2010 at 10:02 AM, Yan Gao wrote: >>> Hi, >>> A suggestion/request was raised with regard to the utilization feature: >>> The ability to see the remaining capacity of n

Re: [Pacemaker] Feedback: Website Updates

2010-03-24 Thread Andrew Beekhof
On Wed, Mar 24, 2010 at 2:14 PM, wrote: > Hi, > > first of all: the succesful redesign of the clusterlabs website looks very > nice and the splash page is a really good idea! > > However I found some small teething troubles: > > * "Explore" tab: > - The link "Site updates" is not working yet. fi

[Pacemaker] Packaging of corosync 1.2.1

2010-03-24 Thread Andreas Mock
Hi Andrew, you said that you're interested in the results of CTS running against pacemaker 1.0.8/corosync on real servers with openSuSE 11.2: I get quickly problems with the meanwhile identified (thanks to S. Drake) bug in shutting down pacemaker and friends which causes corosync to hang around i

[Pacemaker] Feedback: Website Updates

2010-03-24 Thread martin . braun
Hi, first of all: the successful redesign of the clusterlabs website looks very nice and the splash page is a really good idea! However I found some small teething troubles: * "Explore" tab: - The link "Site updates" is not working yet. - I would also suggest to add a direct link to the wiki (Ho

Re: [Pacemaker] [PATCH] Show utilization/capacity information

2010-03-24 Thread Yan Gao
On 03/24/10 18:52, Andrew Beekhof wrote: > On Wed, Mar 24, 2010 at 10:02 AM, Yan Gao wrote: >> Hi, >> A suggestion/request was raised with regard to the utilization feature: >> The ability to see the remaining capacity of nodes, and how the load is >> distributed right now. >> >> I've added a new

Re: [Pacemaker] WARN: Rexmit of seq ..........

2010-03-24 Thread Joseph, Lester
Yes they can. That's the confusing bit As I mentioned in previous email, we rebooted the switch and the nodes would have lost connectivity briefly. But the switch is back online now and the nodes have connectivity as they did before. Yet the message is constantly being generated despite my

Re: [Pacemaker] About replacement of clone and handling of the fail number of times.

2010-03-24 Thread Andrew Beekhof
2010/3/24 : > Hi Andrew, > >> Do you mean: why is the clone on srv01 always $clone:0 but on srv02 >> its sometimes $clone:0 and sometimes $clone:1 ? > > yes. > > The replacement thought both nodes to be the same movement. > Because it is "globally-unique=false". globally-unique=false" means t

Re: [Pacemaker] WARN: Rexmit of seq ..........

2010-03-24 Thread Andrew Beekhof
On Wed, Mar 24, 2010 at 11:58 AM, Joseph, Lester < lester.jos...@galacoral.com> wrote: > Thanks. > > The nodes in question are Virtual machines. Interestingly, we had a > maintenance window this morning to restart the switches on the bladecenter > serving this cluster. > > > > Do I have to reboot

Re: [Pacemaker] WARN: Rexmit of seq ..........

2010-03-24 Thread Joseph, Lester
Thanks. The nodes in question are Virtual machines. Interestingly, we had a maintenance window this morning to restart the switches on the bladecenter serving this cluster. Do I have to reboot the VM's to resolve this? This is only being seen on one of the nodes. I have already put this node i

Re: [Pacemaker] [PATCH] Show utilization/capacity information

2010-03-24 Thread Andrew Beekhof
On Wed, Mar 24, 2010 at 10:02 AM, Yan Gao wrote: > Hi, > A suggestion/request was raised with regard to the utilization feature: > The ability to see the remaining capacity of nodes, and how the load is > distributed right now. > > I've added a new option "--show-utilization" of ptest to achieve t

Re: [Pacemaker] WARN: Rexmit of seq ..........

2010-03-24 Thread Andrew Beekhof
Switch going bad? Not sure, I don't use heartbeat much these days. On Wed, Mar 24, 2010 at 10:15 AM, Joseph, Lester < lester.jos...@galacoral.com> wrote: > Hi, > > Not sure if this has already mean addressed in this list….apologies if it > has. > > Noticed a similar thread here > http://www.gos

[Pacemaker] WARN: Rexmit of seq ..........

2010-03-24 Thread Joseph, Lester
Hi, Not sure if this has already mean addressed in this list….apologies if it has. Noticed a similar thread here http://www.gossamer-threads.com/lists/linuxha/dev/32877?do=post_view_threaded. But no viable solution. In a two node cluster, which has been working fine for months now, I am now g

[Pacemaker] [PATCH] Show utilization/capacity information

2010-03-24 Thread Yan Gao
Hi, A suggestion/request was raised with regard to the utilization feature: The ability to see the remaining capacity of nodes, and how the load is distributed right now. I've added a new option "--show-utilization" of ptest to achieve that. Attached the patch. Please help review it. Thanks a lot

Re: [Pacemaker] About replacement of clone and handling of the fail number of times.

2010-03-24 Thread renayama19661014
Hi Andrew, > Do you mean: why is the clone on srv01 always $clone:0 but on srv02 > its sometimes $clone:0 and sometimes $clone:1 ? yes. The replacement thought both nodes to be the same movement. Because it is "globally-unique=false". Best Regards, Hideo Yamauchi. --- Andrew Beekhof wrot

Re: [Pacemaker] [PATCH] crm_mon expansion patch

2010-03-24 Thread Andrew Beekhof
On Wed, Mar 24, 2010 at 3:32 AM, Yuusuke IIDA wrote: > Hi Andrew, > > Please confirm it last time because I revised loop processing pointed out. I like it :-) Although I'll probably change it slightly to differentiate between partial and total failure: + if(crm_parse_int(attrvalue

Re: [Pacemaker] About replacement of clone and handling of the fail number of times.

2010-03-24 Thread Andrew Beekhof
2010/3/24 : > Hi Andrew, > > Thank you for comment. > >> So if I can summarize, you're saying that clnUMdummy02 should not be >> allowed to run on srv01 because the combined number of failures is 6 >> (and clnUMdummy02 is a non-unique clone). >> >> And that the current behavior is that clnUMdummy0

Re: [Pacemaker] Collocation Resources

2010-03-24 Thread Andrew Beekhof
On Wed, Mar 24, 2010 at 12:43 AM, Travis Dolan wrote: > I would like to know if it is possible to configure more than two resources > within a collocation group. > > Simply put I have 10 Virtual IPs that will need to migrate from Node A to > Node B in the event of any failures. I also need these I

Re: [Pacemaker] pacemaker resource constraints

2010-03-24 Thread Andrew Beekhof
On Tue, Mar 23, 2010 at 11:51 PM, Alan Jones wrote: > BTW: The order matters in the colocation rule.  When I configure: > colocation colo-master_worker -1: master worker > Then "failback" is blocked by the stickiness.  In my opinion this is a bug, > but others may have an explanation. The order i