Re: [ClusterLabs] Error "xml does not conform to the schema" upon "pcs cluster standby" command

2016-06-03 Thread Klaus Wenninger
mycluster"/> > > > [root@airv_cu root]# pcs status > Cluster name: > > -Regards > Nikhil > > > On Fri, Jun 3, 2016 at 4:46 PM, Klaus Wenninger <kwenn...@redhat.com > <mailto:kwenn...@redhat.com>> wrote: > > On 06/03/20

Re: [ClusterLabs] Apache Active Active Balancer without FileSystem Cluster

2016-06-13 Thread Klaus Wenninger
On 06/13/2016 02:33 PM, alan john wrote: > Dear All, > > I am trying to set up an Apache active-active cluster. I do not wish > to have a common file system for both nodes. However, I do not want > pcs/corosync to start or stop apache, but to monitor it and move > only the VIP to secondary

Re: [ClusterLabs] Alert notes

2016-06-15 Thread Klaus Wenninger
On 06/15/2016 06:11 PM, Ferenc Wágner wrote: > Hi, > > Please find some random notes about my adventures testing the new alert > system. > > The first alert example in the documentation has no recipient: > > > > In the example above, the cluster will call my-script.sh for each >

Re: [ClusterLabs] Using pacemaker for manual failover only?

2016-05-27 Thread Klaus Wenninger
ything is reacting as desired ... > > On Tue, May 24, 2016 at 11:02 AM, Ken Gaillot <kgail...@redhat.com > <mailto:kgail...@redhat.com>> wrote: > > On 05/24/2016 04:13 AM, Klaus Wenninger wrote: > > On 05/24/2016 09:50 AM, Jehan-Guillaume de Rorthais

Re: [ClusterLabs] design question to DRBD

2016-06-22 Thread Klaus Wenninger
On 06/22/2016 11:17 PM, Lentes, Bernd wrote: > > Original message > From: Dimitri Maziuk > Date: 22.06.2016 21:23 (GMT+01:00) > To: users@clusterlabs.org > Subject: Re: [ClusterLabs] design question to DRBD > > On 06/22/2016 02:13 PM, Lentes, Bernd

Re: [ClusterLabs] Antw: Alert notes

2016-06-24 Thread Klaus Wenninger
On 06/24/2016 09:16 AM, Ulrich Windl wrote: Ferenc Wágner wrote on 15.06.2016 at 18:11 in message > <87vb1a5t4k@lant.ki.iif.hu>: >> Hi, >> > [...] >> The SNMP agent seems to have a problem with hrSystemDate, which should >> be an OCTETSTR with strict format, not some

Re: [ClusterLabs] problems with a CentOS7 SBD cluster

2016-06-28 Thread Klaus Wenninger
On 06/28/2016 11:24 AM, Marcin Dulak wrote: > > > On Tue, Jun 28, 2016 at 5:04 AM, Andrew Beekhof > wrote: > > On Sun, Jun 26, 2016 at 6:05 AM, Marcin Dulak > > wrote: > > Hi, >

Re: [ClusterLabs] Fwd: Getting error when building Pacemaker-1.1 from source

2016-03-01 Thread Klaus Wenninger
I would strongly recommend installing everything at the final destination - in a changeroot build environment if you don't want to taint your build-host. Regards, Klaus On 03/02/2016 03:46 AM, Sharat Joshi wrote: > Hi List Folk, > > I am very new to Pacemaker and I am trying to build using

Re: [ClusterLabs] service network restart and corosync

2016-03-02 Thread Klaus Wenninger
On 03/03/2016 03:08 AM, Debabrata Pani wrote: > Hi, > > In our deployment, due to some requirement, we need to do a : > service network restart > > Due to this corosync crashes and the associated pacemaker processes crash > as well. > > As per the last comment on this issue, > --- > Corosync

Re: [ClusterLabs] Antw: Coming in 1.1.15: Event-driven alerts

2016-04-22 Thread Klaus Wenninger
On 04/22/2016 08:16 AM, Ulrich Windl wrote: Ken Gaillot wrote on 21.04.2016 at 19:50 in message > <571912f3.2060...@redhat.com>: > > [...] >> The alerts section can have any number of alerts, which look like: >> >>>

Re: [ClusterLabs] ClusterLabsComing in 1.1.15: Event-driven alerts

2016-04-22 Thread Klaus Wenninger
On 04/22/2016 10:55 AM, Ferenc Wágner wrote: > Ken Gaillot writes: > >> Each alert may have any number of recipients configured. These values >> will simply be passed to the script as arguments. The first recipient >> will also be passed as the CRM_alert_recipient environment

Re: [ClusterLabs] Monitoring action of Pacemaker resources fail because of high load on the nodes

2016-04-22 Thread Klaus Wenninger
On 04/22/2016 03:29 PM, John Gogu wrote: > Hello community, > I am facing following situation with a Pacemaker 2 nodes DB cluster > (3 resources configured into the cluster - 1 MySQL DB resource, 1 > Apache resource, 1 IP resource ) > -at every 61 seconds an MySQL monitoring action is started and

Re: [ClusterLabs] Antw: Re: Coming in 1.1.15: Event-driven alerts

2016-04-28 Thread Klaus Wenninger
Regards, > Ulrich > >>>> Klaus Wenninger <kwenn...@redhat.com> wrote on 27.04.2016 at 20:14 in > message <57210183.6050...@redhat.com>: >> On 04/27/2016 04:19 PM, renayama19661...@ybb.ne.jp wrote: >>> Hi All, >>> >>> We

Re: [ClusterLabs] [ClusterLab] : Unable to bring up pacemaker

2016-04-28 Thread Klaus Wenninger
On 04/27/2016 05:28 PM, Sriram wrote: > Dear All, > > I m trying to use pacemaker and corosync for the clustering > requirement that came up recently. > We have cross compiled corosync, pacemaker and pcs(python) for ppc > environment (Target board where pacemaker and corosync are supposed to >

Re: [ClusterLabs] Simple clarification regarding pacemaker

2016-04-27 Thread Klaus Wenninger
On 04/26/2016 09:09 PM, K Aravind wrote: > Thank you for the quick responses :) > I starting to understand :) > One more quick question. > Let's say I have a 2 node cluster with stonith-enabled=false and no > quorum policy = ignore > And master-max=1 > Now connection between nodes went down. >

Re: [ClusterLabs] Coming in 1.1.15: Event-driven alerts

2016-04-27 Thread Klaus Wenninger
On 04/27/2016 12:12 PM, Kristoffer Grönlund wrote: > Ken Gaillot writes: > >> The most prominent feature will be Klaus Wenninger's new implementation >> of event-driven alerts -- the ability to call scripts whenever >> interesting events occur (nodes joining/leaving,

Re: [ClusterLabs] Coming in 1.1.15: Event-driven alerts

2016-04-25 Thread Klaus Wenninger
On 04/25/2016 08:03 AM, Kristoffer Grönlund wrote: > Ken Gaillot writes: > >> Hello everybody, >> >> The release cycle for 1.1.15 will be started soon (hopefully tomorrow)! >> >> The most prominent feature will be Klaus Wenninger's new implementation >> of event-driven alerts

Re: [ClusterLabs] Monitoring action of Pacemaker resources fail because of high load on the nodes

2016-04-26 Thread Klaus Wenninger
On 04/26/2016 06:04 AM, Ken Gaillot wrote: > On 04/25/2016 10:23 AM, Dmitri Maziuk wrote: >> On 2016-04-24 16:20, Ken Gaillot wrote: >> >>> Correct, you would need to customize the RA. >> Well, you wouldn't because your custom RA will be overwritten by the >> next RPM update. > Correct again :) >

Re: [ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-05-20 Thread Klaus Wenninger
On 05/20/2016 08:39 AM, Ulrich Windl wrote: Jehan-Guillaume de Rorthais wrote on 19.05.2016 at 21:29 in > message <20160519212947.6cc0fd7b@firost>: > [...] >> I was thinking of a use case where a graceful demote or stop action failed >> multiple times and to give a

Re: [ClusterLabs] Using pacemaker for manual failover only?

2016-05-24 Thread Klaus Wenninger
On 05/24/2016 09:50 AM, Jehan-Guillaume de Rorthais wrote: > On Tue, 24 May 2016 01:53:22 -0400, > Digimer wrote: > >> On 23/05/16 03:03 PM, Stephano-Shachter, Dylan wrote: >>> Hello, >>> >>> I am using pacemaker 1.1.14 with pcs 0.9.149. I have successfully >>> configured

Re: [ClusterLabs] Two related Cluster

2016-05-17 Thread Klaus Wenninger
On 05/17/2016 08:20 AM, H Yavari wrote: > Hi, > > Emm I have a scenario and I'm confused. So I'm searching for the > solutions. Can you please check this > http://clusterlabs.org/pipermail/users/2016-April/002796.html > > I don't know how achieve to this? with Booth? with attribute? 2 >

Re: [ClusterLabs] Coming in 1.1.15: Event-driven alerts

2016-05-13 Thread Klaus Wenninger
o open-source >> clustering welcomed <users@clusterlabs.org> >> Cc: >> Date: 2016/5/12, Thu 06:28 >> Subject: Re: [ClusterLabs] Coming in 1.1.15: Event-driven alerts >> >> Hi Klaus, >> >> Thank you for comment. >> >> I confi

Re: [ClusterLabs] [ClusterLab] : Corosync not initializing successfully

2016-05-02 Thread Klaus Wenninger
As your hardware is probably capable of running ppcle, and if you have an environment at hand without too much effort, it might pay off to try that. There are of course distributions out there that support corosync on big-endian architectures, but I don't know if there is an automated regression for

Re: [ClusterLabs] Unable to run Pacemaker: pcmk_child_exit

2016-05-06 Thread Klaus Wenninger
On 05/06/2016 12:40 PM, Nikhil Utane wrote: > Hi, > > I used the blackbox feature which showed the reason for failure. > As I am cross-compiling pacemaker on a build machine and later moving > the binaries to the target, few binaries were missing. After fixing > that and bunch of other

Re: [ClusterLabs] dropping ssh connection on failover

2016-04-15 Thread Klaus Wenninger
On 04/15/2016 04:59 PM, Dmitri Maziuk wrote: > On 2016-04-15 07:46, Klaus Wenninger wrote: > >> Which IP-address did you use to ssh to that box? One controlled >> by pacemaker and possibly being migrated or a fixed one assigned >> to that box? > > Good try bu

Re: [ClusterLabs] Q: Resource balancing opration

2016-04-20 Thread Klaus Wenninger
On 04/20/2016 08:17 AM, Ulrich Windl wrote: > Hi! > > I'm wondering: If you boot a node on a cluster, most resources will go to > another node (if possible). Due to stickiness configured, those resources > will stay there. > So I'm wondering whether or how I could cause a rebalance of resources

Re: [ClusterLabs] Moving Related Servers

2016-04-20 Thread Klaus Wenninger
9376656 > > http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm140617356537136 > >> >> *From:* Klaus Wenninger <kwenn...@redhat.com> >> *To:* users@clusterlabs

Re: [ClusterLabs] pacemaker apache and umask on CentOS 7

2016-04-20 Thread Klaus Wenninger
On 04/20/2016 05:35 PM, fatcha...@gmx.de wrote: > >> Sent: Wednesday, 20 April 2016 at 16:31 >> From: "Klaus Wenninger" <kwenn...@redhat.com> >> To: users@clusterlabs.org >> Subject: Re: [ClusterLabs] pacemaker apache and umask on CentOS 7 >

Re: [ClusterLabs] Suicide stonith

2016-04-15 Thread Klaus Wenninger
On 04/15/2016 02:21 PM, Digimer wrote: > On 15/04/16 03:12 AM, Andrey Rogovsky wrote: >> Hi >> How does suicide work? >> I want to reboot node when it lost networking and get out from cluster. > Stonith can't rely on the failed/lost node behaving in any predictable > way. So self-fencing, ssh

Re: [ClusterLabs] dropping ssh connection on failover

2016-04-15 Thread Klaus Wenninger
On 04/15/2016 02:22 PM, Digimer wrote: > On 14/04/16 05:38 PM, Dimitri Maziuk wrote: >> Hi all, >> >> I've an up to date centos 7 w/ pcs + corosync + pacemaker active-passive >> cluster running drbd (8.4) and apache. It's all working fine except when >> I trigger a fail-over, my ssh connection to

Re: [ClusterLabs] Coming in 1.1.15: Event-driven alerts

2016-05-09 Thread Klaus Wenninger
Preparing the updated Pacemaker-Explained-section I read through this again. I guess most points where this differs from the actual implementation - currently in the repo - were discussed already but where I remember them I'll insert them here anyway to have them in one place. Reason why I wrote

Re: [ClusterLabs] Reliability questions on the new QDevices in uneven node count Setups

2016-07-25 Thread Klaus Wenninger
On 07/25/2016 04:56 PM, Thomas Lamprecht wrote: > Thanks for the fast reply :) > > > On 07/25/2016 03:51 PM, Christine Caulfield wrote: >> On 25/07/16 14:29, Thomas Lamprecht wrote: >>> Hi all, >>> >>> I'm currently testing the new features of corosync 2.4, especially >>> qdevices. >>> First tests

Re: [ClusterLabs] Pacemaker not always selecting the right stonith device

2016-07-21 Thread Klaus Wenninger
On 07/21/2016 06:40 PM, Andrei Borzenkov wrote: > 19.07.2016 18:24, Klaus Wenninger wrote: >> On 07/19/2016 04:17 PM, Ken Gaillot wrote: >>> On 07/19/2016 09:00 AM, Andrei Borzenkov wrote: >>>> On Tue, Jul 19, 2016 at 4:52 PM, Ken Gaillot <kgail...@redhat.com&

Re: [ClusterLabs] Can Pacemaker monitor geographical separated servers

2016-08-10 Thread Klaus Wenninger
On 08/10/2016 04:43 PM, Jason A Ramsey wrote: > > I can’t answer all of this because I’m still working out how to do > fencing, but I’ve been setting up a Pacemaker cluster in Amazon Web > Services across two separate availability zones. Naturally, this means > that I have to bridge subnets, so

Re: [ClusterLabs] Antw: Coming in 1.1.16: versioned resource parameters

2016-08-11 Thread Klaus Wenninger
On 08/11/2016 09:13 AM, Ulrich Windl wrote: Ken Gaillot wrote on 10.08.2016 at 22:36 in message > <804dd911-56a6-328c-00a4-43133f59d...@redhat.com>: >> Have you ever changed a resource agent in a backward-incompatible way, >> and found yourself wishing you

Re: [ClusterLabs] Pacemaker not always selecting the right stonith device

2016-07-20 Thread Klaus Wenninger
On 07/19/2016 06:54 PM, Andrei Borzenkov wrote: > 19.07.2016 19:01, Andrei Borzenkov wrote: >> 19.07.2016 18:24, Klaus Wenninger wrote: >>> On 07/19/2016 04:17 PM, Ken Gaillot wrote: >>>> On 07/19/2016 09:00 AM, Andrei Borzenkov wrote: >>>>> On Tue

Re: [ClusterLabs] ocf:heartbeat:apache does not start

2016-07-13 Thread Klaus Wenninger
On 07/13/2016 09:24 AM, Heiko Reimer wrote: > > On 13.07.2016 at 09:09, Li Junliang wrote: >> On Wed, 2016-07-13 at 08:59 +0200, Heiko Reimer wrote: >>> Hi, >>> >>> i try to setup pacemaker apache resource with ocf:heartbeat:apache. >>> But >>> when pacemaker try to start the resource i get >>> >>> Failed

Re: [ClusterLabs] system or ocf resource agent

2016-07-08 Thread Klaus Wenninger
On 07/08/2016 06:15 PM, Ken Gaillot wrote: > On 07/08/2016 05:10 AM, Heiko Reimer wrote: >> Hi, >> >> i am setting up new debian 8 ha cluster with drbd, corosync and >> pacemaker with apache and mysql. In my old environment i had configured >> resources with ocf resource agents. Now i have seen

Re: [ClusterLabs] fence_vmware_soap: fail to shutdown VMs

2016-07-11 Thread Klaus Wenninger
On 07/11/2016 12:35 PM, Marek Grac wrote: > Hi, > > 90MB of logs are not a big deal, most of them will just attempt to do > same request again and again. Feel, free to send me a link to this file. > > If you have python-suds then it should be enough, you may try a > different version of this

Re: [ClusterLabs] Limiting number of nodes that can join into a cluster

2016-06-28 Thread Klaus Wenninger
On 06/28/2016 01:18 PM, Nikhil Utane wrote: > Hi, > > I want to limit the number of nodes that can form a cluster to single > digit (say 6). I can do it using application-level logic but would > like to know if there is any option in Corosync that would do it for > me. (Didn't find one). Going

Re: [ClusterLabs] lrmd segfault

2017-01-31 Thread Klaus Wenninger
On 01/31/2017 03:12 PM, ale...@kurnosov.spb.ru wrote: > As i said, we used rpm from standard repo, hardly it compiled incorrectly. > And according > to a spec L5630 (the node's CPU) has SSE4.2 support. And in that case it > should be > illegal instruction exception, not segfault. ... and it is

Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Klaus Wenninger
Maybe you need some mapping between vbox-guest-names and pacemaker node-names? (attribute pcmk_host_map) That you are writing that you added the script as fence_virtual is probably a typo in the mail ... and would probably create a different error message ... On 02/06/2017 07:06 PM, Jihed

Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Klaus Wenninger
n Mon, Feb 6, 2017, 3:20 PM Klaus Wenninger <kwenn...@redhat.com > <mailto:kwenn...@redhat.com>> wrote: > > No experience with fencing vbox-VMs on my side either ... > But as always when there is no physical fencing-device > available sbd might be a way to

Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Klaus Wenninger
No experience with fencing vbox-VMs on my side either ... But as always when there is no physical fencing-device available sbd might be a way to go - either with just a watchdog (guess vbox offers a virtual watchdog that is supported by the linux-kernel or at least if you install the guest-support

Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-23 Thread Klaus Wenninger
On 02/23/2017 07:48 AM, Marek Grac wrote: > Hi, > > we have added support for a host with Windows but it is not trivial to > setup because of various contexts/privileges. > > Install openssh on Windows (tutorial can be found on >

Re: [ClusterLabs] SBD with shared block storage (and watchdog?)

2017-02-13 Thread Klaus Wenninger
On 02/13/2017 07:34 PM, dur...@mgtsciences.com wrote: > emmanuel segura wrote on 02/13/2017 10:55:58 AM: > > > From: emmanuel segura > > To: Cluster Labs - All topics related to open-source clustering > > welcomed > > Date:

Re: [ClusterLabs] SBD with shared block storage (and watchdog?)

2017-02-13 Thread Klaus Wenninger
On 02/13/2017 06:34 PM, dur...@mgtsciences.com wrote: > I am working to get an active/active cluster running. > I have Windows 10 running 2 Fedora 25 Virtualbox VMs. > VMs named node1, and node2. > > I created a vdi disk and set it to shared. > I formatted it to gfs2 with this command. > >

Re: [ClusterLabs] Failed reload

2017-02-09 Thread Klaus Wenninger
On 02/09/2017 05:08 PM, Ken Gaillot wrote: > On 02/08/2017 02:15 AM, Ferenc Wágner wrote: >> Hi, >> >> There was an interesting discussion on this list about "Doing reload >> right" last July (which I still haven't digested entirely). Now I've >> got a related question about the current and

Re: [ClusterLabs] I question whether STONITH is working.

2017-02-16 Thread Klaus Wenninger
On 02/15/2017 10:30 PM, Ken Gaillot wrote: > On 02/15/2017 12:17 PM, dur...@mgtsciences.com wrote: >> I have 2 Fedora VMs (node1, and node2) running on a Windows 10 machine >> using Virtualbox. >> >> I began with this. >>

Re: [ClusterLabs] Disabled resource is hard logging

2017-02-16 Thread Klaus Wenninger
rc. But I can't tell you out of my mind how that worked - there was a discussion a few weeks ago on the list iirc. Regards, Klaus > > Thanks a lot! > > > El 16 feb. 2017 10:57 a. m., "Klaus Wenninger" <kwenn...@redhat.com > <mailto:kwenn...@redhat.com>>

Re: [ClusterLabs] I question whether STONITH is working.

2017-02-16 Thread Klaus Wenninger
On 02/16/2017 05:42 PM, dur...@mgtsciences.com wrote: > Klaus Wenninger <kwenn...@redhat.com> wrote on 02/16/2017 03:27:07 AM: > > > From: Klaus Wenninger <kwenn...@redhat.com> > > To: kgail...@redhat.com, Cluster Labs - All topics related to open- > &g

Re: [ClusterLabs] Disabled resource is hard logging

2017-02-16 Thread Klaus Wenninger
/nfs-vdic-mgmt-vm/vdicsunstone01.xml does not exist or is not > readable. > VirtualDomain(vm-vdicone01)[73742]: 2017/02/16_16:43:40 INFO: > Configuration file /mnt/nfs-vdic-mgmt-vm/vdicone01.xml not > readable, resource considered stopped. > VirtualDo

Re: [ClusterLabs] get status of each RG

2017-01-16 Thread Klaus Wenninger
On 01/16/2017 04:10 PM, Ken Gaillot wrote: > On 01/15/2017 10:02 AM, Florin Portase wrote: >> Hello, >> >> We're about to create some HPOM (HP Operations Manager ) monitoring >> policies for RHEL7 cluster environment. >> >> However, it looks like getting status of running defined RG seems way to

Re: [ClusterLabs] corosync/pacemaker on ~100 nodes cluser

2016-08-23 Thread Klaus Wenninger
On 08/23/2016 06:26 PM, Radoslaw Garbacz wrote: > Hi, > > I would like to ask for settings (and hardware requirements) to have > corosync/pacemaker running on about 100 nodes cluster. Actually I had thought that 16 would be the limit for full pacemaker-cluster-nodes. For larger deployments

Re: [ClusterLabs] When the DC crmd is frozen, cluster decisions are delayed infinitely

2016-09-05 Thread Klaus Wenninger
On 09/03/2016 08:42 PM, Shermal Fernando wrote: > > Hi, > > > > Currently our system have 99.96% uptime. But our goal is to increase > it beyond 99.999%. Now we are studying the > reliability/performance/features of pacemaker to replace the existing > clustering solution. > > > > While testing

Re: [ClusterLabs] Failover IP with Monitoring but not controling the colocated services.

2016-09-05 Thread Klaus Wenninger
On 09/05/2016 01:38 PM, Stefan Schörghofer wrote: > Hi List, > > I am currently trying to setup the following situation in my lab: > > |--Cluster IP--| > | HAProxy instances |HAProxy instances | > | Node 1| Node 2 | > > > > Now

Re: [ClusterLabs] ip clustering strange behaviour

2016-09-05 Thread Klaus Wenninger
t gracefully shut pacemaker on node2. > > > Now I restarted, everything was up, stopped pacemaker service on > > > host2 and I got host1 with both IPs configured. ;) > > > > > > But, though I understand that if I halt host2 with no grace > shut of > > > pacemaker, it w

Re: [ClusterLabs] ip clustering strange behaviour

2016-09-05 Thread Klaus Wenninger
//www.gabrielebulfon.com/> > *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon > > > > -- > > From: Klaus Wenninger <kwenn...@redhat.com> > To: users@clusterlabs.org > Date: 5 September 2016 12.21.25 CEST > O

Re: [ClusterLabs] ip clustering strange behaviour

2016-09-01 Thread Klaus Wenninger
don't expect host1 > > to lose its own IP! Why? > > > > Gabriele > > > > > > ---- > > *Sonicle S.r.l. *: http://www.sonicle.com <http://www.sonicle.com/> >

Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

2016-09-08 Thread Klaus Wenninger
On 09/08/2016 10:58 AM, Shermal Fernando wrote: > Hi Jehan-Guillaume, > > Does this mean the watchdog will self-terminate the machine when the crm daemon > is frozen? Would be desirable but doesn't seem to happen - at least till now - will see what I can do on that front. > > Regards, > Shermal

Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

2016-09-08 Thread Klaus Wenninger
On 09/08/2016 08:55 AM, Digimer wrote: > On 08/09/16 03:47 PM, Ulrich Windl wrote: > Shermal Fernando wrote on 08.09.2016 at > 06:41 in >> message >> <8ce6e8d87f896546b9c65ed80d30a4336578c...@lg-spmb-mbx02.lseg.stockex.local>: >>> The whole cluster will

Re: [ClusterLabs] Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

2016-09-08 Thread Klaus Wenninger
On 09/08/2016 02:28 PM, Ulrich Windl wrote: >>>> Klaus Wenninger <kwenn...@redhat.com> wrote on 08.09.2016 at 09:13 in > message <4c828344-44da-1d93-b43f-a305cfaa5...@redhat.com>: >> On 09/08/2016 08:55 AM, Digimer wrote: >>> On 08/09/16 03:47 PM, Ul

Re: [ClusterLabs] ip clustering strange behaviour

2016-08-30 Thread Klaus Wenninger
cle.com <http://www.sonicle.com/> > *Music: *http://www.gabrielebulfon.com > <http://www.gabrielebulfon.com/> > *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon > > > > >

Re: [ClusterLabs] Pacemaker Active-Active setup monitor problem

2016-09-12 Thread Klaus Wenninger
On 09/12/2016 12:55 PM, Alex wrote: > Hi all, > > I am having a problem with one of our pacemaker clusters that is > running in an active-active configuration. > > Sometimes the Website monitor will timeout, triggering and apache > restart that fails. That will increase the fail-count to INFINITY

Re: [ClusterLabs] Pacemaker quorum behavior

2016-09-09 Thread Klaus Wenninger
st, Poughkeepsie, N.Y. > INTERNET: swgre...@us.ibm.com > PHONE: 8/293-7301 (845-433-7301) M/S: POK 42HA/P966 > > > Inactive hide details for Klaus Wenninger ---09/08/2016 10:59:27 > AM---On 09/08/2016 03:55 PM, Scott Greenlese wrote: >Klaus Wenninger > ---09/08/2016 10:59:27

Re: [ClusterLabs] Colocation and ordering with live migration

2016-10-10 Thread Klaus Wenninger
On 10/10/2016 10:17 AM, Pavel Levshin wrote: > Hello. > > We are trying to migrate our services to relatively fresh version of > cluster software. It is RHEL 7 with pacemaker 1.1.13-10. I’ve faced a > problem when live migration of virtual machines is allowed. In short, > I need to manage

Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

2016-10-10 Thread Klaus Wenninger
maker. Always helpful to discuss/clarify an idea once some code is available ... > Was the discussion of using WD service over so far? Not from my pov. Just a day off ;-) > > > Best Regard, > Hideo Yamauchi. > > > - Original Message - >> From: Klaus Wenninger

Re: [ClusterLabs] Colocation and ordering with live migration

2016-10-10 Thread Klaus Wenninger
On 10/10/2016 02:00 PM, Pavel Levshin wrote: > > 10.10.2016 14:32, Klaus Wenninger: >> Why are the order-constraints between libvirt & vms optional? > > If they were mandatory, then all the virtual machines would be > restarted when libvirtd restarts. This is not

Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Klaus Wenninger
On 09/20/2016 10:25 PM, Ken Gaillot wrote: > Hi everybody, > > Currently, Pacemaker's on-fail property allows you to configure how the > cluster reacts to operation failures. The default "restart" means try to > restart on the same node, optionally moving to another node once > migration-threshold

Re: [ClusterLabs] where do I find the null fencing device?

2016-09-19 Thread Klaus Wenninger
On 09/17/2016 04:35 PM, Dan Swartzendruber wrote: > > I wanted to do some experiments, and the null fencing agent seemed to > be just what I wanted. I don't find it anywhere, even after > installing fence-agents-all and cluster-glue (this is on CentOS 7, > btw...) Thanks... > Depending on what

Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

2016-09-22 Thread Klaus Wenninger
If we are finally facing an issue I'd herewith like to ask for input. > > Best Regards, > Hideo Yamauchi. > > > > - Original Message - >> From: Klaus Wenninger <kwenn...@redhat.com> >> To: users@clusterlabs.org >> Cc: >> Date: 2016/9/21,

Re: [ClusterLabs] systemd RA start/stop delays

2016-08-18 Thread Klaus Wenninger
On 08/18/2016 03:17 AM, TEG AMJG wrote: > Hi > > I am having a problem with a simple Active/Passive cluster which > consists in the next configuration > > Cluster Name: kamcluster > Corosync Nodes: > kam1vs3 kam2vs3 > Pacemaker Nodes: > kam1vs3 kam2vs3 > > Resources: > Resource: ClusterIP

Re: [ClusterLabs] systemd RA start/stop delays

2016-08-18 Thread Klaus Wenninger
On 08/18/2016 04:00 PM, Ken Gaillot wrote: > On 08/17/2016 08:17 PM, TEG AMJG wrote: >> Hi >> >> I am having a problem with a simple Active/Passive cluster which >> consists in the next configuration >> >> Cluster Name: kamcluster >> Corosync Nodes: >> kam1vs3 kam2vs3 >> Pacemaker Nodes: >>

Re: [ClusterLabs] pacemakerd quits after few seconds with some errors

2016-08-22 Thread Klaus Wenninger
On 08/23/2016 12:20 AM, Ken Gaillot wrote: > On 08/22/2016 12:17 PM, Gabriele Bulfon wrote: >> Hi, >> >> I built corosync/pacemaker for our XStreamOS/illumos : corosync starts >> fine and log correctly, pacemakerd quits after some seconds with the >> attached log. >> Any idea where is the issue? >

Re: [ClusterLabs] data loss of network would cause Pacemaker exit abnormally

2016-08-29 Thread Klaus Wenninger
On 08/28/2016 04:15 AM, chenhj wrote: > Hi all, > > When i use the following command to simulate data lost of network at > one member of my 3 nodes Pacemaker+Corosync cluster, > sometimes it cause Pacemaker on another node exit. > > tc qdisc add dev eth2 root netem loss 90% > > Is there any

Re: [ClusterLabs] ip clustering strange behaviour

2016-08-29 Thread Klaus Wenninger
On 08/29/2016 05:18 PM, Gabriele Bulfon wrote: > Hi, > > now that I have IPaddr work, I have a strange behaviour on my test > setup of 2 nodes, here is my configuration: > > ===STONITH/FENCING=== > > primitive xstorage1-stonith stonith:external/ssh-sonicle op monitor > interval="25" timeout="25"

Re: [ClusterLabs] corosync.log is 5.1GB in a short period

2016-08-23 Thread Klaus Wenninger
On 08/23/2016 08:31 AM, Kristoffer Grönlund wrote: > 朱荣 writes: > >> Hello: >> I have a problem with the corosync log: my corosync log grew to 5.1GB in >> a short time. >> When I check the corosync log, it shows the same message repeated within a short >> period, as in the attachment.

Re: [ClusterLabs] pacemaker validate-with

2016-08-24 Thread Klaus Wenninger
On 08/24/2016 04:17 PM, Gabriele Bulfon wrote: > Hi, > > now I've got my pacemaker 1.1.14 and corosync 2.4.1 working together > running an empty configuration of just 2 nodes. > So I run crm configure but I get this error: > > ERROR: CIB not supported: validator 'pacemaker-2.4', release '3.0.10' >

Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-29 Thread Klaus Wenninger
On 08/29/2016 03:47 PM, Ken Gaillot wrote: > On 08/29/2016 04:17 AM, Gabriele Bulfon wrote: >> Hi Ken, >> >> I have been talking with the illumos guys about the shell problem. >> They all agreed that ksh (and specially the ksh93 used in illumos) is >> absolutely Bourne-compatible, and that the

Re: [ClusterLabs] Howto restart resource

2016-08-29 Thread Klaus Wenninger
On 08/29/2016 04:03 PM, Ken Gaillot wrote: > On 08/29/2016 01:38 AM, Stefano Ruberti wrote: >> Dear all, >> >> I have following situation and I need an advice from you: >> >> in my Active/Passive Cluster (Ubuntu_16.04 corosync + pacemaker , no pcs) >> >> Node_ANode_B >> Resource1

Re: [ClusterLabs] Is it possible to sign up for cluster events from Pacemaker?

2016-09-26 Thread Klaus Wenninger
> > Thanks for the answer. > > I also was hoping to hear that I can do the case from c++ code. > > Thank you, > Kostia > > On Mon, Sep 26, 2016 at 1:59 PM, Klaus Wenninger > <kwenn...@redhat.com <mailto:kwenn...@redhat.com>>

Re: [ClusterLabs] Is it possible to sign up for cluster events from Pacemaker?

2016-09-26 Thread Klaus Wenninger
On 09/26/2016 12:29 PM, Kostiantyn Ponomarenko wrote: > Hi, > > I am wondering if it is possible to sign up for cluster events from > Pacemaker? Something like: > - a node joins/leaves the cluster, > - a resource fails, > - a resource moves, > - etc. Sounds like a use case for the

Re: [ClusterLabs] Establishing Timeouts

2016-10-10 Thread Klaus Wenninger
On 10/10/2016 06:58 PM, Eric Robinson wrote: > Thanks for the clarification. So what's the easiest way to ensure that the > cluster waits a desired timeout before deciding that a re-convergence is > necessary? By raising the token (lost) timeout I would say. Please correct me (Chrissie) but I

Re: [ClusterLabs] Establishing Timeouts

2016-10-10 Thread Klaus Wenninger
On 10/10/2016 08:35 PM, Eric Robinson wrote: > Basically, when we turn off a switch, I want to keep the cluster from failing > over before Linux bonding has had a chance to recover. > > I'm mostly interested in preventing false-positive cluster failovers that > might occur during manual network

Re: [ClusterLabs] Colocation and ordering with live migration

2016-10-10 Thread Klaus Wenninger
On 10/10/2016 06:56 PM, Ken Gaillot wrote: > On 10/10/2016 10:21 AM, Klaus Wenninger wrote: >> On 10/10/2016 04:54 PM, Ken Gaillot wrote: >>> On 10/10/2016 07:36 AM, Pavel Levshin wrote: >>>> 10.10.2016 15:11, Klaus Wenninger: >>>>> On 10/10/2016 02:00

Re: [ClusterLabs] Antw: Resources wont start on new node unless it is the only active node

2016-11-09 Thread Klaus Wenninger
On 11/09/2016 09:33 AM, Ulrich Windl wrote: Ryan Anstey wrote on 08.11.2016 at 19:54 in message > : >> I've been running a ceph cluster with pacemaker for a few months now. >> Everything has

Re: [ClusterLabs] How Pacemaker reacts to fast changes of the same parameter in configuration

2016-11-09 Thread Klaus Wenninger
o stop anything anymore. So you can save the time the starts would take. Unfortunately you have to repeat that and thus put additional load on pacemaker possibly slowing down things if your poll-cycle is too short. > > > Thank you, > Kostia > > On Tue, Nov 8, 2016 at 10

Re: [ClusterLabs] How Pacemaker reacts to fast changes of the same parameter in configuration

2016-11-08 Thread Klaus Wenninger
On 11/08/2016 11:40 AM, Kostiantyn Ponomarenko wrote: > Hi, > > I need a way to do a manual fail-back on demand. > To be clear, I don't want it to be ON/OFF; I want it to be more like > "one shot". > So far I found that the most reasonable way to do it is to set > "resource stickiness" to a

Re: [ClusterLabs] Antw: Re: How Pacemaker reacts to fast changes of the same parameter in configuration

2016-11-10 Thread Klaus Wenninger
On 11/10/2016 08:27 AM, Ulrich Windl wrote: >>>> Klaus Wenninger <kwenn...@redhat.com> wrote on 09.11.2016 at 17:42 in > message <80c65564-b299-e504-4c6c-afd0ff86e...@redhat.com>: >> On 11/09/2016 05:30 PM, Kostiantyn Ponomarenko wrote: >>> When o

Re: [ClusterLabs] pacemaker after upgrade from wheezy to jessie

2016-11-10 Thread Klaus Wenninger
On 11/10/2016 09:47 AM, Toni Tschampke wrote: >> Did your upgrade documentation describe how to update the corosync >> configuration, and did that go well? crmd may be unable to function due >> to lack of quorum information. > > Thanks for this tip, corosync quorum configuration was the cause. > >

Re: [ClusterLabs] Antw: Re: How Pacemaker reacts to fast changes of the same parameter in configuration

2016-11-10 Thread Klaus Wenninger
I have full control over it all > the time. > > Klaus Wenninger, > > You are right. That is exactly what I want and what I am concerned > about. Another example with "move" operation is 100% correct. > > I've been thinking about another possible approach here since > y

Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Klaus Wenninger
On 11/07/2016 10:26 AM, Jehan-Guillaume de Rorthais wrote: > On Mon, 7 Nov 2016 10:12:04 +0100 > Klaus Wenninger <kwenn...@redhat.com> wrote: > >> On 11/07/2016 08:41 AM, Ulrich Windl wrote: >>>>>> Ken Gaillot <kgail...@redhat.com> wrote on 04.11.2016

Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Klaus Wenninger
On 11/07/2016 08:41 AM, Ulrich Windl wrote: Ken Gaillot wrote on 04.11.2016 at 22:37 in message > <27c2ca20-c52c-8fb4-a60f-5ae12f7ff...@redhat.com>: >> On 11/04/2016 02:29 AM, Ulrich Windl wrote: >> Ken Gaillot wrote on 03.11.2016 at

Re: [ClusterLabs] Live migration not working on shutdown

2016-11-03 Thread Klaus Wenninger
On 11/02/2016 06:32 PM, Ken Gaillot wrote: > On 10/26/2016 06:12 AM, Rainer Nerb wrote: >> Hello all, >> >> we're currently testing a 2-node-cluster with 2 vms and live migration >> on CentOS 7.2 and Pacemaker 1.1.13-10 with disks on iSCSI-targets and >> migration via ssh-method. >> >> Live

Re: [ClusterLabs] Pacemaker 1.1.16 - Release Candidate 1

2016-11-03 Thread Klaus Wenninger
On 11/03/2016 07:13 PM, Adam Spiers wrote: > Klaus Wenninger <kwenn...@redhat.com> wrote: >> On 11/03/2016 05:28 PM, Adam Spiers wrote: >>> Ken Gaillot <kgail...@redhat.com> wrote: >>>> ClusterLabs is happy to announce the first release candidate for

Re: [ClusterLabs] I've been working on a split-brain prevention strategy for 2-node clusters.

2016-10-10 Thread Klaus Wenninger
On 10/10/2016 04:25 PM, Ken Gaillot wrote: > On 10/09/2016 11:02 PM, Digimer wrote: >> On 09/10/16 11:58 PM, Andrei Borzenkov wrote: >>> 10.10.2016 00:42, Eric Robinson wrote: Digimer, thanks for your thoughts. Booth is one of the solutions I looked at, but I don't like it because it is

Re: [ClusterLabs] setting up SBD_WATCHDOG_TIMEOUT, stonith-timeout and stonith-watchdog-timeout

2016-12-14 Thread Klaus Wenninger
On 12/14/2016 01:26 PM, Jehan-Guillaume de Rorthais wrote: > On Thu, 8 Dec 2016 11:47:20 +0100 > Jehan-Guillaume de Rorthais wrote: > >> Hello, >> >> While setting these various parameters, I couldn't find documentation and >> details about them. Below are some questions. >> >>

Re: [ClusterLabs] New ClusterLabs logo unveiled :-)

2017-01-11 Thread Klaus Wenninger
On 01/11/2017 10:54 AM, Kostiantyn Ponomarenko wrote: > Nice logo! > > http://wiki.clusterlabgs.org/ doesn't load for me. Obviously a typo further down in the thread; http://wiki.clusterlabs.org/ does work. Regards, Klaus > > I also have a question which bothers me

Re: [ClusterLabs] Corosync/pacemaker seeing monitored process as FAILED

2017-01-01 Thread Klaus Wenninger
Hi Suresh! Have you tried lsb-status in a shell? Does it show anything interesting or is it hanging? Regards, Klaus On 12/30/2016 08:45 AM, Suresh Rajagopalan wrote: > Cluster running centos 6.8 with pacemaker/corosync. This config was > running well for quite some time. All of a sudden we

Re: [ClusterLabs] Antw: Re: Antw: sbd: Cannot open watchdog device: /dev/watchdog

2017-01-04 Thread Klaus Wenninger
On 01/04/2017 02:23 PM, Muhammad Sharfuddin wrote: > On 01/04/2017 06:05 PM, Ulrich Windl wrote: > Muhammad Sharfuddin wrote on > 04.01.2017 at 11:58 in >> message <9ff82caa-d16e-13f4-e514-d356224f8...@nds.com.pk>: >>> On 01/04/2017 12:09 PM, Ulrich Windl
