Re: [Pacemaker] Patches for VirtualDomain RA

2011-08-08 Thread Dominik Klein
1) During stop operation libvirt occasionally returns an error because the state cannot be determined just the moment the machine is shut down. This patch makes the RA try to get the state again one time. If the machine is down then everything is OK. 2) The next problem is that a graceful

Re: [Pacemaker] init script VS pacemaker to start a service

2011-06-08 Thread Dominik Klein
On 06/07/2011 07:09 PM, CeR wrote: Hi there! I have some doubts, hope you folks can help me. In a system I have two (or more) ways to start a daemon: A) /etc/init.d/ script. The service could be started by the system (/etc/rcX) or by me manually. B) The daemon has an executable with an

Re: [Pacemaker] Location issue

2011-06-08 Thread Dominik Klein
On 06/08/2011 10:39 AM, ruslan usifov wrote: Hello I have follow constraint: location ms_drbd_web-U_slave_on_drbd3 ms_drbd_web-U \ rule role=slave inf: #uname eq drbd3 Which as i think it prevents slave role from launch on all hosts except drbd3, nope it says put the slave

Re: [Pacemaker] Location issue

2011-06-08 Thread Dominik Klein
but when i shutdown drbd3 host Pacemaker try start slave role on other host. How can i prevent this behavior? try s/inf/-inf s/eq/neq ne actually, sorry ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
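
The difference between the original rule (`inf: #uname eq drbd3`) and the suggested one (`-inf: #uname ne drbd3`) can be sketched with a toy model. This is not Pacemaker code: the helper functions and the node names drbd1 and drbd2 are hypothetical, and the model assumes only that a rule gives its score to every node matching the expression and that a score of -INFINITY bans a node. The poster's follow-up reports that the slave role was still started elsewhere, so treat this as an illustration of the intended rule semantics, not of the observed cluster behaviour.

```python
INF = float("inf")

def location_scores(nodes, score, matches):
    """Give `score` to every node the rule expression matches; other nodes get 0."""
    return {node: (score if matches(node) else 0) for node in nodes}

def eligible(scores):
    """In this model only a score of -INFINITY makes a node ineligible."""
    return sorted(node for node, s in scores.items() if s != -INF)

# drbd3 comes from the thread; drbd1 and drbd2 are hypothetical peers.
nodes = ["drbd1", "drbd2", "drbd3"]

# Original rule: inf: #uname eq drbd3 (a preference for drbd3, not a ban elsewhere).
original = location_scores(nodes, INF, lambda n: n == "drbd3")

# Suggested rule: -inf: #uname ne drbd3 (a ban on every node except drbd3).
suggested = location_scores(nodes, -INF, lambda n: n != "drbd3")

print(eligible(original))
print(eligible(suggested))
```

In this model the original rule leaves all three nodes eligible, while the suggested rule leaves only drbd3.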

Re: [Pacemaker] Location issue

2011-06-08 Thread Dominik Klein
drbd3 result is identical, pacemaker try launch slave role on other nodes:-((( 2011/6/8 Dominik Klein d...@in-telegence.net but when i shutdown drbd3 host Pacemaker try start slave role on other host. How can i prevent this behavior? try

Re: [Pacemaker] Statefull firewall cluster Active/Pasive with conntrackd issues

2011-05-11 Thread Dominik Klein
netfilter is smarter than you think it is. It can distinguish between packet flows forming an allowed flow and actually invalid packets. That's default behaviour. This only works if there's no helper module needed. So with the likes of NAT or FTP connections, this will not work without

Re: [Pacemaker] pacemaker keeps crashing

2011-03-16 Thread Dominik Klein
On 03/15/2011 03:51 PM, Andrew Beekhof wrote: On Tue, Mar 15, 2011 at 2:35 PM, Dominik Klein d...@in-telegence.net wrote: Hi I installed a new 3 node cluster today. I used the instructions on the install page from the wiki and up to corosync start everything went smooth. At that point

[Pacemaker] pacemaker keeps crashing

2011-03-15 Thread Dominik Klein
Hi I installed a new 3 node cluster today. I used the instructions on the install page from the wiki and up to corosync start everything went smooth. At that point, apparently the following loop of corosync spawning pacemaker and pacemaker crashing starts. See logs on

Re: [Pacemaker] Resource-Monitoring with an On Fail-Action

2010-03-16 Thread Dominik Klein
Tom Tux wrote: Hi I have a question about the resource-monitoring: I'm monitoring an ip-resource every 20 seconds. I have configured the On Fail-action with restart. This works fine. If the monitor-operation fails, then the resource will be restarted. But how can I define this

Re: [Pacemaker] [Patch showscores.sh]

2010-03-15 Thread Dominik Klein
err, yeah. That wasn't right. Use this one. Regards Dominik Dominik Klein wrote: Minor Update. Just noticed it doesn't display stickiness=0 if stickiness is unset. So failcount and migration-threshold columns were mixed up. Patch against stable-1.0 Regards Dominik exporting patch: # HG

Re: [Pacemaker] Breaking pacemaker

2010-02-16 Thread Dominik Klein
jimbob palmer wrote: Hello, I have a cluster that is all working perfectly. Time to break it. This is a two node master/slave cluster with drbd. Failover between the nodes works backwards and forwards. Everything is happier than a well fed cat. I wanted to see what would happen if the

Re: [Pacemaker] High load issues

2010-02-05 Thread Dominik Klein
But generally I believe this test case is invalid. I might agree here that this test case does not necessarily reproduce what happened on my production system (unfortunately I do not know for sure what happened there, the dev who caused this just tells me he used some stupid sql statement and

Re: [Pacemaker] Fwd: [Cluster-devel] Organizing Bug Squash Party for Cluster 3.x, GFS2 and more

2010-02-03 Thread Dominik Klein
Koch, Sebastian wrote: Ahh great, that's good news. I've never been to Australia hehe. If it would be in Germany or maybe Austria i will participate and try my best to help squash bugs. But i am no developer i am more a technician. I may be wrong here, but I think this party will have

Re: [Pacemaker] Fwd: [Cluster-devel] Organizing Bug Squash Party for Cluster 3.x, GFS2 and more

2010-02-02 Thread Dominik Klein
Koch, Sebastian wrote: Hi, i am kind of new in the whole cluster stuff but i would like to participate and contribute. But the main question is in which country ;-) I'd guess in #linux-cluster country, no? :)

Re: [Pacemaker] APC Master Stonith

2010-01-19 Thread Dominik Klein
Errol Neal wrote: On Tue, Jan 19, 2010 04:19 PM, Sander van Vugt m...@sandervanvugt.nl wrote: Hi, I hope someone has configured the APC Master Stonith resource (which you would use to have pacemaker to a device like the APC switched rack PDU), as I have a - probably extremely stupid -

[Pacemaker] corosync init script broken

2009-12-28 Thread Dominik Klein
Hi cluster people been a while, couldn't really follow things. Today I was tasked to install a new cluster, went for 1.0.6 and corosync as described on the wiki and hit this: New cluster with pacemaker 106 and latest available corosync from the clusterlabs.org/rpm opensuse 11.1 repo. This

Re: [Pacemaker] Looking for correct constraints

2009-10-30 Thread Dominik Klein
Michael Schwartzkopff wrote: Am Freitag, 30. Oktober 2009 13:26:35 schrieb Lars Marowsky-Bree: On 2009-10-30T13:19:52, Michael Schwartzkopff mi...@multinet.de wrote: I have a three node cluster. I have two resources that are not allowed to run together in the cluster. Basically resource2 is a

Re: [Pacemaker] Looking for correct constraints

2009-10-30 Thread Dominik Klein
Maybe set a cluster-wide attribute, which, when set, does not allow res2 to run. Ie rule with score -infinity. res1 could remove this attribute while starting and set this attribute when stopping. This does not make any sense. Sorry, let me try again. res1 start = set attribute res1 stop =

Re: [Pacemaker] strange behaviour for ssh and eth0

2009-10-28 Thread Dominik Klein
gilberto migliavacca wrote: Hi Dominik How can I configure the node's ips as cluster resources? sorry for the silly question but I'm a newbie in this field thanks in advance gilberto Dominik Klein wrote: gilberto migliavacca wrote: Hi I have 2 nodes and 1 node that I'm using

Re: [Pacemaker] bug: multi state and target-role=started results in promote

2009-10-15 Thread Dominik Klein
i thought that for multistate resources, Started == Slave. am i mistaken? did this change some time ago? Afaik, that was only true for status display in crm_mon. But also, that was fixed quite a while ago. Regards Dominik

Re: [Pacemaker] Migration, constraints and failback off

2009-09-02 Thread Dominik Klein
Diego Woitasen wrote: HI I'm building a two node cluster with Xen, DRBD and Pacemaker+Heartbeat. I've set default_resouce_stickiness to INFINITY to disable failback (I want to handle it manually). When I want to migrate a resource I execute crm resource migrate gw-piso-lab and crm

Re: [Pacemaker] Slave does not get become Master after unplugging power cable at master

2009-08-18 Thread Dominik Klein
hj lee wrote: Thank very much for the reply. I tested it both stonith-enabled and no-quorum-policy. As Dejan pointed, this is related to stonith-enabled. With stonith-enabled true (which is default), if I kill the master node, the slave stays as a slave, it seems expecting something from

Re: [Pacemaker] Problem with ocf:heartbeat:mysql

2009-08-17 Thread Dominik Klein
Michal wrote: Hi, When I try to start mysql with config: primitive drbd1 ocf:heartbeat:drbd \ params drbd_resource=db \ op monitor role=Master interval=59s timeout=30s \ op monitor role=Slave interval=60s timeout=30s ms ms-drbd1 drbd1 \ meta clone-max=2 master-max=1 master-node-max=1

Re: [Pacemaker] Problem with ocf:heartbeat:mysql

2009-08-17 Thread Dominik Klein
Dominik Klein wrote: Michal wrote: Hi, When I try to start mysql with config: primitive drbd1 ocf:heartbeat:drbd \ params drbd_resource=db \ op monitor role=Master interval=59s timeout=30s \ op monitor role=Slave interval=60s timeout=30s ms ms-drbd1 drbd1 \ meta clone-max=2 master-max=1

Re: [Pacemaker] RFC: Better error reporting for RAs.

2009-08-03 Thread Dominik Klein
Though I don't see the point, grepping for the resource id is usually just as effective. I totally agree here. I have helped quite a few people understand their problems on IRC and grepping the resource id usually works well. I'd suggest focusing on improving the error logging that most RAs

Re: [Pacemaker] Some showscores.sh questions

2009-05-18 Thread Dominik Klein
Whether it's in an RPM or not, could the author add a license header to it? dk: what license do you want? Just use what you use for all the cluster code. Regards Dominik

Re: [Pacemaker] How to create OCF Resource Agents

2009-04-28 Thread Dominik Klein
Paul Osier wrote: I'm trying to create an OCF resource agent that will start/stop/monitor SER. I've read through the opencf.org resource agent api doc and the wiki.linux-ha.org OCF resource agent doc and all those documents talk about is what is needed in the resource agent, not necessarily

Re: [Pacemaker] [Linux-HA] showscores.sh for pacemaker 1.0.2

2009-04-01 Thread Dominik Klein
Bruno Voigt wrote: Hi Dominik, I use your script occasionally, together with Pacemaker packaged for Debian by martin.loschw...@linbit.com. When running the new version I get as first output line: tail: cannot open `+2' for reading: No such file or directory and then the resource scores

Re: [Pacemaker] couple of comments/questions on DRBD HowTo 1.0

2009-03-30 Thread Dominik Klein
Juha Heinanen wrote: Dominik Klein writes: The bug has been reported to Dejan (the crm shell dev) and he will fix it. are all bugs fixed also in OpenAIS 0.80.x branch (whitetank), which is labelled on openais.com site as the stable release? The crm shell is a part of pacemaker

Re: [Pacemaker] couple of comments/questions on DRBD HowTo 1.0

2009-03-30 Thread Dominik Klein
Dominik Klein wrote: Juha Heinanen wrote: Lars Ellenberg writes: If that Lars? meant me, yes, please, go ahead an delete outdated examples. Replace with a reference to the drbd users guide http://www.drbd.org/docs/about/ or http://www.drbd.org/docs/install/ how about

Re: [Pacemaker] couple of comments/questions on DRBD HowTo 1.0

2009-03-29 Thread Dominik Klein
Juha Heinanen wrote: Lars Ellenberg writes: If that Lars? meant me, yes, please, go ahead an delete outdated examples. Replace with a reference to the drbd users guide http://www.drbd.org/docs/about/ or http://www.drbd.org/docs/install/ how about the webserver example in DRBD

Re: [Pacemaker] restart of resource is not attempted

2009-03-29 Thread Dominik Klein
Juha Heinanen wrote: i moved all my resources to the standby node. on this node, mysql resource had a problem that prevented it from starting. i fixed the problem and assumed that pacemaker would now automatically start mysql, but it does not even try. it gave up after the first error even

Re: [Pacemaker] couple of comments/questions on DRBD HowTo 1.0

2009-03-27 Thread Dominik Klein
Lars Ellenberg wrote: On Wed, Mar 18, 2009 at 10:17:24AM +0100, Dominik Klein wrote: Juha Heinanen wrote: Prerequisites section says that DRBD must not be started by init. In Debian lenny at least, the drbd init script loads the drbd module. if drbd init is not run, the drbd module needs to be loaded

Re: [Pacemaker] Monitor a resource without the cluster reacting to the result...

2009-03-26 Thread Dominik Klein
Joe Bill wrote: Hi Dominik! dk at in-telegence wrote: I'd love to see something like: # crm_resource -m check_level resource_id .. This should be possible: export OCF_ROOT=/usr/lib/ocf export OCF_RESKEY_your_variable=your_value export OCF_RESKEY_your_variable2=your_value2

Re: [Pacemaker] Need your help in debugging

2009-03-26 Thread Dominik Klein
Priyanka Ranjan wrote: Hi All, i am facing issue in ilo stonith. i have configured ilo stonith in my cluster. it is running fine but it is not stonithing the errant node. in case of failure, the syslog message on DC says that we can't manage this node with same parameters value , i

Re: [Pacemaker] Monitor a resource without the cluster reacting to the result...

2009-03-25 Thread Dominik Klein
foxyc...@yahoo.com wrote: I've been wanting this for some time now and expecting pacemaker would include it in it's newer versions. But I've checked the latest pacemaker 1.0 distribution fresh of the day, and unfortunately have found nothing in it indicating if this is possible. -

Re: [Pacemaker] question related to resource starting

2009-03-24 Thread Dominik Klein
Glory Smith wrote: On Tue, Mar 24, 2009 at 12:16 PM, Dominik Klein d...@in-telegence.net wrote: Glory Smith wrote: Hi All, when we create a resource , how pacemaker choose a node to start resource on it. To be more clear , suppose we have four node cluster , we configure any resource

Re: [Pacemaker] Colocation advice seeked

2009-03-20 Thread Dominik Klein
Hi Actually, I built a system just like that for presentation purpose (so just using Dummy resource, but that doesnt matter) to replace a system that is currently using keepalived. We seem to want to achieve just the same thing. Here's how I did it: # m1 = mysql 1 primitive m1

Re: [Pacemaker] Colocation advice seeked

2009-03-20 Thread Dominik Klein
By default, m1 and m1-ip are on xen-03, m2 and m2-ip are on xen-04. Scores for the ips are m1 xen-03 175 (100 node preference + 75 colocation with m1) m1 xen-04 125 (50 node preference + 75 colocation with m2) m2 xen-03 125 (50 node preference + 75 colocation with m1) m2 xen-04 175 (100 node
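
The score arithmetic quoted above can be reproduced with a small sketch. This is a toy model of the addition only, not Pacemaker's allocator; the function name `ip_score` and the data layout are hypothetical, and the truncated fourth entry is assumed to follow the same pattern as the first three (node preference of 100 or 50, plus 75 for colocation with whichever mysql instance runs on that node).

```python
# Colocation score between an IP and a mysql instance, as quoted in the message.
COLOCATION = 75

# Node preference of each IP (100 for its "home" node, 50 for the other).
NODE_PREFERENCE = {
    ("m1-ip", "xen-03"): 100,
    ("m1-ip", "xen-04"): 50,
    ("m2-ip", "xen-03"): 50,
    ("m2-ip", "xen-04"): 100,
}

def ip_score(ip, node, mysql_location):
    """Node preference plus the colocation bonus if any mysql instance runs on `node`."""
    score = NODE_PREFERENCE[(ip, node)]
    if node in mysql_location:
        score += COLOCATION
    return score

# Default placement from the message: m1 on xen-03, m2 on xen-04.
default = {"xen-03": "m1", "xen-04": "m2"}
for ip in ("m1-ip", "m2-ip"):
    for node in ("xen-03", "xen-04"):
        print(ip, node, ip_score(ip, node, default))

# Hypothetical failure case: m1 is gone, so xen-03 runs no mysql instance.
m1_failed = {"xen-04": "m2"}
print("m1-ip", "xen-03", ip_score("m1-ip", "xen-03", m1_failed))
print("m1-ip", "xen-04", ip_score("m1-ip", "xen-04", m1_failed))
```

With the default placement the model gives 175 and 125, matching the quoted figures. In the hypothetical failure case m1-ip scores 100 on xen-03 and 125 on xen-04, so in this model the IP would prefer the node that still runs a mysql instance.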

Re: [Pacemaker] couple of comments/questions on DRBD HowTo 1.0

2009-03-18 Thread Dominik Klein
Juha Heinanen wrote: Prerequisites section says that DRBD must not be started by init. In Debian lenny at least, the drbd init script loads the drbd module. if drbd init is not run, the drbd module needs to be loaded by some other means, for example, by adding drbd line to /etc/modules. The RA will

Re: [Pacemaker] how to prevent auto relocation of recources to old primary?

2009-03-18 Thread Dominik Klein
Juha Heinanen wrote: i tried the apache web server example of DRBD HowTo 1.0 with small changes: 1) replaced webserver primitive with mysqlserver primitive 2) removed location primitive, since i don't care which node the resources run. when i shutdown the current primary, the resources

Re: [Pacemaker] how to prevent auto relocation of recources to old primary?

2009-03-18 Thread Dominik Klein
Juha Heinanen wrote: Dominik Klein writes: Sounds like you missed the order and colocation constraints. Please post your configuration. i have order and colocation, but removed location, because i thought that by doing so, the resources will stay where they are working. -- juha

Re: [Pacemaker] how to prevent auto relocation of recources to old primary?

2009-03-18 Thread Dominik Klein
i wonder why the line location ms-drbd0-master-on-xen-1 ms-drbd0 rule role=master 100: #uname eq xen-1 is in the example config, because heartbeat seems to be doing what the line says even without it. The section states that If you want to prefer a node to run the master role (xen-1 in

[Pacemaker] patch: pingd RA

2009-03-10 Thread Dominik Klein
High: RA pingd: Set default ping interval to 1 instead of 0 seconds. Produced high load and traffic. xen-03:~ # cat /proc/loadavg 1.53 1.54 1.47 4/213 6733 xen-03:~ # ps aux|grep pingd root 6735 0.0 0.0 5284 808 pts/1 S+ 09:52 0:00 grep pingd root 17399 40.7 0.0 65316

[Pacemaker] showscores.sh for pacemaker 1.0.2

2009-03-03 Thread Dominik Klein
Hi I made the necessary changes to the showscores script to work with pacemaker 1.0.2. Please test and report problems. Has been reported to work by some people and should go into the repository soon. Still, I'd like more people to test and confirm. Important changes: * correctly fetch

Re: [Pacemaker] spilit brain situation

2009-02-06 Thread Dominik Klein
This is not happening in my case. i dont have any stonith configured in my cluster . do i need stonith to handle spilit brain situation. On Fri, Feb 6, 2009 at 1:59 PM, Dominik Klein d...@in-telegence.net wrote: Romi Verma wrote: Thanks for fast reply , Ok, Let me explain the situation. i