Re: [Linux-HA] Master Became Slave - Cluster unstable $$$

2014-04-08 Thread Maloja01
is without any warranty and further support, sorry. Maloja01 On Mon, Apr 7, 2014 at 4:16 PM, Maloja01 wrote: On 04/07/2014 03:00 PM, Ammar Sheikh Saleh wrote: thanks for your help ... can you guide me to the correct commands : I don't understand what is in this command crm(live)n

Re: [Linux-HA] Master Became Slave - Cluster unstable $$$

2014-04-06 Thread Maloja01
FIRST you need to set up fencing (STONITH) - I do not see any stonith resource in your cluster - that WILL be a problem in your cluster. You cannot "migrate" a Master/Slave. You should use "crm_master" to score the master placement. And you should remove all client-Prefer-location-rules whic

Re: [Linux-HA] How to tell pacemaker to process a new event during a long-running resource operation

2014-03-15 Thread Maloja01
On 03/14/2014 08:50 PM, David Vossel wrote: - Original Message - From: "Maloja01" To: "Linux-HA" Sent: Friday, March 14, 2014 5:32:34 AM Subject: [Linux-HA] How to tell pacemaker to process a new event during a long-running resource operation Hi all, I have a

[Linux-HA] How to tell pacemaker to process a new event during a long-running resource operation

2014-03-14 Thread Maloja01
Hi all, I have a resource which could in special cases have a very long-running start operation. If I have a new event (like switching a standby node back to online) during the already running transition (cluster is still S_TRANSITION_ENGINE) I would like the cluster to process it as soon

Re: [Linux-HA] Resended : Understanding how heartbeat and pacemaker work together

2012-01-13 Thread Maloja01
On 01/13/2012 11:04 AM, Niclas Müller wrote: > I've grouped both as www-services and not it is running like i want. > Change to takeover is 4-6 sec. Its good, but I want to go to 1-3 sec as > far as possible. Much process last will there not because I only made a > Projekt for school with Linux

Re: [Linux-HA] if promote runs into timeout

2012-01-10 Thread Maloja01
on the other hand) not take 10 hours to get the end-customers service working. In such a case even pacemaker could not do anything. Kind regards Fabian > > But nice tip! > > Thx > Erkan :) > > On Sat, Jan 07, 2012 at 10:22:32AM +0100, Maloja01 wrote: >> In an other cust

Re: [Linux-HA] if promote runs into timeout

2012-01-07 Thread Maloja01
In another customer setup we decided to set a resource to status "unmanaged" when it has to do some special work which should not be interrupted. After the replication (in our case redo logs in a backup db) we set the resource to be managed again. I have never tried to change already triggered tim

Re: [Linux-HA] Q: crm shell: things more complex than "group"

2011-08-19 Thread Maloja01
On 08/18/2011 12:19 PM, Ulrich Windl wrote: > Hi! > > Reading the docs, I learned that pacemaker understands more complex > dependencies than "group" where resources are strictly sequential. For > example one could start a set of resources in parallel, then wait until all > are done, then start

Re: [Linux-HA] Problem with kvm virtual machine and cluster

2011-08-10 Thread Maloja01
The order constraints work as I would expect, but I guess that you ran into a pitfall: a clone is marked as "up" if one instance in the cluster has started successfully. The order does not say that the clone on the same node must be up. Kind regards Fabian On 08/10/2011 01:43 PM, i...@umbertocarrar

Re: [Linux-HA] location and orders : Question about a behavior ...

2011-08-05 Thread Maloja01
de3 even quickly, I should have also seen group-2 stopped/restarted > due to the order-group-2 constraint) > > Hope it helps to clarify ... > Thanks again > Alain > > > > De :Maloja01 > A : linux-ha@lists.linux-ha.org > Date : 05/08/2011 11:40 > Obje

Re: [Linux-HA] ocf::LVM monitor needs excessive time to complete

2011-08-05 Thread Maloja01
Hi, processes in state D look like they are blocked in a kernel call/device request. Do you have a problem with your storage? This is not cluster-related. Kind regards Fabian On 08/05/2011 01:55 PM, Ulrich Windl wrote: > Hi, > > we run a cluster that has about 30 LVM VGs that are monitored every minute

Re: [Linux-HA] location and orders : Question about a behavior ...

2011-08-05 Thread Maloja01
On 08/02/2011 05:06 PM, alain.mou...@bull.net wrote: > Hi > > I have this simple configuration of locations and orders between resources > group-1 , group-2 and clone-1 > (on a two nodes ha cluster with Pacemaker-1.1.2-7 /corosync-1.2.3-21) : > > location loc1-group-1 group-1 +100: node2 > loc

Re: [Linux-HA] Antw: Re: location and orders : Question about a behavior ...

2011-08-05 Thread Maloja01
just reopen my first msg in this thread, it would be nice > for me ... Yes you are right - so I will "rewind" the thread beginning from message 1 :) > Thanks a lot anyway. > Alain > > > > De :Maloja01 > A : linux-ha@lists.linux-ha.org > Date : 05/

Re: [Linux-HA] Antw: Re: location and orders : Question about a behavior ...

2011-08-05 Thread Maloja01
On 08/05/2011 08:30 AM, Ulrich Windl wrote: >>>> Maloja01 wrote on 04.08.2011 at 18:49 in message > <4e3acd86.1020...@arcor.de>: >> Hi Ulrich, >> >> I did not follow the complete thread, just jumped in - sorry. Is the >> resource inside a r

Re: [Linux-HA] Antw: Re: location and orders : Question about a behavior ...

2011-08-04 Thread Maloja01
: >>>> Maloja01 wrote on 04.08.2011 at 12:58 in message > <4e3a7b5c.1030...@arcor.de>: >> On 08/04/2011 08:28 AM, Ulrich Windl wrote: >>> Hi! >>> >>> Isn't the stickiness effectively based on the failcount? We have one >> resource >&g

Re: [Linux-HA] Antw: Re: location and orders : Question about a behavior ...

2011-08-04 Thread Maloja01
On 08/04/2011 08:28 AM, Ulrich Windl wrote: > Hi! > > Isn't the stickiness effectively based on the failcount? We have one resource > that has a location constraint for one node with a weight of 50 and a > stickiness of 10. The resource runs on a different node and shows no > tendency of

Re: [Linux-HA] string2msg_ll: node [?] failed authentication

2011-08-02 Thread Maloja01
Are there other nodes with the same multicast address? On 08/02/2011 12:38 AM, Hai Tao wrote: > > I reinstalled the OS for node1 (in a two nodes HA, and the node1 had a disk > error), and reconfigured HA. however, after restarting the heartbeat, I see > many errors of " string2msg_ll: node [?] f

Aw: Re: [Linux-HA] Looking for efficient STONITH device - must not be single-point-of-failure

2009-02-10 Thread maloja01
STONITH You could use - ilo system management boards - ipmi system management boards - power switches. You can even run stonith -l to figure out a proper set of stonith devices. And yes, you can set up more than one heartbeat link. Just add another link directive to /etc/ha.d/ha.cf -

Aw: Re: [Linux-HA] Getting Heartbeat OK

2008-08-30 Thread maloja01
Access rights to the directory? - is the directory available (created)? - Original message From: Lars Marowsky-Bree <[EMAIL PROTECTED]> To: General Linux-HA mailing list Date: 29.08.2008 17:35 Subject: Re: [Linux-HA] Getting Heartbeat OK > On 2008-08-29T17:23:27, Adrian C

Aw: [Linux-HA] Split-Brain occurrence, what do I do now?

2008-02-15 Thread maloja01
Split-brain situations are *very* critical for a two-node setup, especially when you are using shared media like disks, drbd syncs and so on. For bigger clusters the problem is a bit easier, because you get a quorum loss if half the nodes are "down" or disconnected. You can use the directive

Re: [Linux-HA] quorumd: Problem with certificates

2008-02-07 Thread maloja01
Did you use the correct cn (certificate attribute cn must be equal to the cluster name)? If you use the cluster name "mycluster" and your quorum server could be reached with a special name (I don't remember it now, but you can strace it easily) you can also use quorumdtest as a client test program to

[Linux-HA] Q: Difference between after and before order rules

2008-02-07 Thread maloja01
Hi all, what's the defined difference between the two order rules 1: A before B 2: B after A For normal operation I guess these rules produce the same start sequence. But is there a difference if A or B fails (during start or operation)? Regards Fabian

Re: [Linux-HA] pingd, quorum, split-brain... should I give up?

2007-10-23 Thread maloja01
Riccardo Perni wrote: > > > Andrew Beekhof <[EMAIL PROTECTED]> wrote: > >> On 10/22/07, Riccardo Perni <[EMAIL PROTECTED]> wrote: >>> >> Is it possible >>> >> to handle this situation? >>> > >>> > You may try quorumd. See >>> > >>> > http://www.linux-ha.org/QuorumServerGuide >>> >>> I'm g

[Linux-HA] I/O fencing method?

2007-10-23 Thread maloja01
I am searching for an I/O fencing method like SCSI(3) reservation. Is there any method implemented yet for use with heartbeat to avoid accidentally mounting the same file system multiple times from different nodes? Of course I could configure heartbeat not to mount twice and I could use a quorum server

[Linux-HA] Online extention of the cluster

2007-07-13 Thread maloja01
Is it possible to extend a running cluster with new cluster nodes? The extension should be done without stopping any resource placed on nodes that are already running in the cluster before we extend the cluster. If it is possible, can I use the "is_managed" attribute to leave the resources untouc

[Linux-HA] Online extention of the cluster (ho üpefuly not a duplicate mail)

2007-07-13 Thread maloja01
I hope my email is not sent twice, but my last mail seems not to have reached the list. My message was: Is it possible to extend a running cluster with new cluster nodes? The extension should be done without stopping any resource placed on nodes that are already running in the cluster before we extend