Re: [ClusterLabs] Moving Related Servers

2016-04-18 Thread Ken Gaillot
On 04/18/2016 02:34 AM, H Yavari wrote: > Hi, > > I have 4 CentOS servers (App1, App2, App3 and App4). I created a cluster > for App1 and App2 with a floating IP and it works well. > In our infrastructure App1 works only with App3 and App2 only works with > App4. I mean we have 2 server sets (App1

Re: [ClusterLabs] pacemaker apache and umask on CentOS 7

2016-04-20 Thread Ken Gaillot
On 04/20/2016 09:11 AM, fatcha...@gmx.de wrote: > Hi, > > I'm running a 2-node apache webcluster on a fully patched CentOS 7 > (pacemaker-1.1.13-10.el7_2.2.x86_64 pcs-0.9.143-15.el7.x86_64). > Some files which are generated by the apache are created with a umask 137 but > I need these files

Re: [ClusterLabs] Q: Resource balancing opration

2016-04-20 Thread Ken Gaillot
On 04/20/2016 01:17 AM, Ulrich Windl wrote: > Hi! > > I'm wondering: If you boot a node on a cluster, most resources will go to > another node (if possible). Due to stickiness configured, those resources > will stay there. > So I'm wondering whether or how I could cause a rebalance of resources

Re: [ClusterLabs] Moving Related Servers

2016-04-20 Thread Ken Gaillot
617356537136 > > > *From:* Klaus Wenninger <kwenn...@redhat.com> > *To:* users@clusterlabs.org > *Sent:* Wednesday, 20 April 2016, 9:56:05 > *Subject:* Re: [ClusterLabs] Moving Related Servers > > On 04/19/2016 04:32 PM, Ken Gaillot wrote: >> On

Re: [ClusterLabs] pacemaker apache and umask on CentOS 7

2016-04-20 Thread Ken Gaillot
On 04/20/2016 12:20 PM, Klaus Wenninger wrote: > On 04/20/2016 05:35 PM, fatcha...@gmx.de wrote: >> >>> Sent: Wednesday, 20 April 2016 at 16:31 >>> From: "Klaus Wenninger" >>> To: users@clusterlabs.org >>> Subject: Re: [ClusterLabs] pacemaker apache and umask on CentOS

Re: [ClusterLabs] Antw: Re: Q: Resource balancing opration

2016-04-21 Thread Ken Gaillot
On 04/21/2016 01:56 AM, Ulrich Windl wrote: >>>> Ken Gaillot <kgail...@redhat.com> wrote on 20.04.2016 at 16:44 in >>>> message >> You can also use rules to make the process intelligent. For example, for >> a server that provides office services,

Re: [ClusterLabs] service flap as nodes join and leave

2016-04-14 Thread Ken Gaillot
On 04/14/2016 09:33 AM, Christopher Harvey wrote: > MsgBB-Active is a dummy resource that simply returns OCF_SUCCESS on > every operation and logs to a file. That's a common mistake, and will confuse the cluster. The cluster checks the status of resources both where they're supposed to be running
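As a rough illustration of the point above: a dummy agent has to remember whether it was started, so that a monitor can report "not running" on nodes where the resource is stopped. The sketch below is not the poster's agent. It is written in Python for brevity, the state-file path and function names are hypothetical, and only the numeric return codes (0 for success, 7 for not running) are taken from the OCF resource agent API.

```python
import os
import tempfile

# Standard OCF return codes.
OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7

# Hypothetical state file; a real agent would derive this from the resource instance.
STATE_FILE = os.path.join(tempfile.gettempdir(), "MsgBB-Active.state")


def start(state_file=STATE_FILE):
    """Record that the resource is active on this node."""
    with open(state_file, "w") as f:
        f.write("started\n")
    return OCF_SUCCESS


def stop(state_file=STATE_FILE):
    """Forget the state; stopping an already stopped resource still succeeds."""
    if os.path.exists(state_file):
        os.remove(state_file)
    return OCF_SUCCESS


def monitor(state_file=STATE_FILE):
    """Report the recorded state instead of always returning success."""
    return OCF_SUCCESS if os.path.exists(state_file) else OCF_NOT_RUNNING
```

With this shape, a monitor on a node where the resource was never started returns 7 rather than 0, so the cluster does not conclude that the resource is active in more than one place.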

Re: [ClusterLabs] Antw: Re: Antw: Doing reload right

2016-07-14 Thread Ken Gaillot
On 07/13/2016 11:20 PM, Andrew Beekhof wrote: > On Wed, Jul 6, 2016 at 12:57 AM, Ken Gaillot <kgail...@redhat.com> wrote: >> On 07/04/2016 02:01 AM, Ulrich Windl wrote: >>> For the case of changing the contents of an external configuration file, the >>> RA woul

Re: [ClusterLabs] Pacemaker in puppet with cib.xml?

2016-07-21 Thread Ken Gaillot
On 07/21/2016 01:35 PM, Stephano-Shachter, Dylan wrote: > Hello all, > > I want to put the pacemaker config for my two node cluster in puppet > but, since it is just one cluster, it seems overkill to use the corosync > module. If I just have puppet push cib.xml to each machine, will that > work?

Re: [ClusterLabs] Pacemaker in puppet with cib.xml?

2016-07-21 Thread Ken Gaillot
. If I wanted to make any big changes, I can make > them with pcs and just pull another config. Sounds good. > On Thu, Jul 21, 2016 at 2:52 PM, Ken Gaillot <kgail...@redhat.com > <mailto:kgail...@redhat.com>> wrote: > > On 07/21/2016 01:35 PM, Stephano-Shachter,

Re: [ClusterLabs] Previous DC fenced prior to integration

2016-07-29 Thread Ken Gaillot
On 07/28/2016 01:48 PM, Nate Clark wrote: > On Mon, Jul 25, 2016 at 2:48 PM, Nate Clark <n...@neworld.us> wrote: >> On Mon, Jul 25, 2016 at 11:20 AM, Ken Gaillot <kgail...@redhat.com> wrote: >>> On 07/23/2016 10:14 PM, Nate Clark wrote: >>>> On Sat,

Re: [ClusterLabs] Bloody Newbie needs help for OCFS2 on pacemaker+corosync+pcs

2016-08-02 Thread Ken Gaillot
On 08/02/2016 08:16 AM, t...@it-hluchnik.de wrote: > Hello Kyle + all, > > No luck at all. Cant get o2cb up at all. Please find details below. > Thanks in advance for any help. > > First I tried to translate your crm syntax to pcs syntax: > > primitive p_o2cb lsb:o2cb \ op monitor interval="10"

Re: [ClusterLabs] Antw: Coming in 1.1.16: versioned resource parameters

2016-08-11 Thread Ken Gaillot
On 08/11/2016 03:35 AM, Klaus Wenninger wrote: > On 08/11/2016 09:13 AM, Ulrich Windl wrote: >>>>> Ken Gaillot <kgail...@redhat.com> wrote on 10.08.2016 at 22:36 in >>>>> message >> <804dd911-56a6-328c-00a4-43133f59d...@redhat.com>: >&

[ClusterLabs] Coming in 1.1.16: versioned resource parameters

2016-08-10 Thread Ken Gaillot
ntov, who created this feature as part of a student project with EMC under the supervision of Victoria Cherkalova. -- Ken Gaillot <kgail...@redhat.com> ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Proj

Re: [ClusterLabs] changing pacemaker.log location

2016-08-12 Thread Ken Gaillot
On 08/12/2016 10:19 AM, Christopher Harvey wrote: > I'm surprised I'm having such a hard time figuring this out on my own. > I'm running pacemaker 1.1.13 and corosync-2.3.4 and want to change the > location of pacemaker.log. > > By default it is located in /var/log. > > I looked in corosync.c

Re: [ClusterLabs] Doing reload right

2016-07-20 Thread Ken Gaillot
On 07/20/2016 11:47 AM, Adam Spiers wrote: > Ken Gaillot <kgail...@redhat.com> wrote: >> Hello all, >> >> I've been meaning to address the implementation of "reload" in Pacemaker >> for a while now, and I think the next release will be a good time, as

Re: [ClusterLabs] can't start/stop a drbd resource with pacemaker

2016-07-18 Thread Ken Gaillot
On 07/15/2016 07:08 PM, Lentes, Bernd wrote: > > > - Am 15. Jul 2016 um 23:48 schrieb Ken Gaillot kgail...@redhat.com: > >> On 07/15/2016 03:54 PM, Lentes, Bernd wrote: >>> >>> >>> - Am 13. Jul 2016 um 14:25 schrie

Re: [ClusterLabs] Pacemaker not always selecting the right stonith device

2016-07-19 Thread Ken Gaillot
On 07/18/2016 05:51 PM, Martin Schlegel wrote: > Hello all > > I cannot wrap my brain around what's going on here ... any help would prevent > me > from fencing my brain =:-D > > > > Problem: > > When completely network isolating a node, i.e. pg1 - sometimes a different > node > gets

Re: [ClusterLabs] pg cluster secondary not syncing after failover

2016-07-15 Thread Ken Gaillot
On 07/15/2016 08:58 AM, Peter Brunnengräber wrote: > Hello all, > My apologies for cross-posting this from the postgresql admins list. I am > beginning to think this may have more to do with the postgresql cluster > script. > > I'm having an issue with a postgresql 9.2 cluster after

Re: [ClusterLabs] Clusvcadm -Z substitute in Pacemaker

2016-07-13 Thread Ken Gaillot
On 07/13/2016 05:50 AM, emmanuel segura wrote: > using pcs resource unmanage leave the monitoring resource actived, I > usually set the monitor interval=0 :) Yep :) An easier way is to set "enabled=false" on the monitor, so you don't have to remember what your interval was later. You can set it
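A minimal sketch of what the "enabled=false" suggestion amounts to in the CIB: the monitor operation keeps its interval and simply gains an enabled attribute. The XML fragment, resource name and ids below are made up for illustration, and in practice the change would be made through a cluster tool such as pcs or crm rather than by editing XML directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical CIB fragment; the ids and the resource are illustrative only.
CIB_FRAGMENT = """
<primitive id="my_rsc" class="ocf" provider="heartbeat" type="IPaddr2">
  <operations>
    <op id="my_rsc-monitor-10s" name="monitor" interval="10s"/>
    <op id="my_rsc-start-0" name="start" interval="0" timeout="20s"/>
  </operations>
</primitive>
"""


def disable_monitors(primitive):
    """Set enabled="false" on every monitor op, leaving the interval untouched."""
    changed = []
    for op in primitive.iter("op"):
        if op.get("name") == "monitor":
            op.set("enabled", "false")
            changed.append(op.get("id"))
    return changed


primitive = ET.fromstring(CIB_FRAGMENT)
changed_ids = disable_monitors(primitive)
print(changed_ids)
```

Because the interval stays in place, re-enabling the monitor later only requires flipping the attribute back.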

Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: Antw: RES: Pacemaker and OCFS2 on stand alone mode

2016-07-13 Thread Ken Gaillot
On 07/13/2016 03:10 AM, Ulrich Windl wrote: >>>> Ken Gaillot <kgail...@redhat.com> wrote on 12.07.2016 at 21:19 in >>>> message > <578542bf.9010...@redhat.com>: >> On 07/12/2016 01:16 AM, Ulrich Windl wrote: > > [...] >>&

Re: [ClusterLabs] Preventing pacemaker from attempting to start a VirtualDomain resource on a pacemaker-remote guest node

2016-07-12 Thread Ken Gaillot
On 07/12/2016 12:46 PM, Scott Loveland wrote: > What is the most efficient way to prevent pacemaker from attempting to > start a VirtualDomain resource on pacemaker-remote guest nodes? > > I’m running pacemaker 1.1.13 in a KVM host cluster with a large number > of VirtualDomain (VD) resources

Re: [ClusterLabs] Clusvcadm -Z substitute in Pacemaker

2016-07-13 Thread Ken Gaillot
On 07/13/2016 09:56 AM, emmanuel segura wrote: > enabled=false works with every pacemaker versions? It was introduced in Pacemaker 1.0.2, so realistically, yes :) > 2016-07-13 16:48 GMT+02:00 Ken Gaillot <kgail...@redhat.com>: >> On 07/13/2016 05:50 AM, emmanuel segura wr

Re: [ClusterLabs] Default Behavior

2016-06-28 Thread Ken Gaillot
On 06/28/2016 10:53 AM, Pavlov, Vladimir wrote: > Hello! > > We have Pacemaker cluster of two node Active/Backup (OS Centos 6.7), > with resources IPaddr2 and ldirectord. > > Cluster Properties: > > cluster-infrastructure: cman > > dc-version: 1.1.11-97629de > > no-quorum-policy: ignore > >

Re: [ClusterLabs] How about making pacemaker OCF_ROOT_DIR more portable ?

2016-07-05 Thread Ken Gaillot
On 07/04/2016 10:07 PM, Li Junliang wrote: > Hi all, > Currently, OCF_ROOT_DIR in configure.ac for pacemaker is fixed > to /usr/lib/ocf. But in resource-agents project , we can set > OCF_ROOT_DIR by setting "--with-ocf-root=xxx" . So , there may be two > different OCF directories . Would it

Re: [ClusterLabs] DC of the election will be an infinite loop.

2016-07-07 Thread Ken Gaillot
On 06/27/2016 11:41 PM, 飯田 雄介 wrote: > Hi, all > > I added two lines to comment on cib.xml using crmsh. > > === > # cat test.crm > node 167772452: test-3 > # comment line 1 > # comment line 2 > property cib-bootstrap-options: \ > have-watchdog=false \ >

Re: [ClusterLabs] system or ocf resource agent

2016-07-08 Thread Ken Gaillot
On 07/08/2016 05:10 AM, Heiko Reimer wrote: > Hi, > > i am setting up new debian 8 ha cluster with drbd, corosync and > pacemaker with apache and mysql. In my old environment i had configured > resources with ocf resource agents. Now i have seen that there is > systemd. Which agent would you

Re: [ClusterLabs] Proposed change for 1.1.16: ending python 2.6 compatibility

2016-07-06 Thread Ken Gaillot
On 07/05/2016 06:49 PM, Digimer wrote: > On 05/07/16 01:31 PM, Ken Gaillot wrote: >> As you may be aware, python 3 is a significant, backward-incompatible >> restructuring of the python language. Most development of the python 2 >> series has ended, and support for python

Re: [ClusterLabs] Antw: Proposed change for 1.1.16: ending python 2.6 compatibility

2016-07-06 Thread Ken Gaillot
On 07/06/2016 01:07 AM, Ulrich Windl wrote: >>>> Ken Gaillot <kgail...@redhat.com> wrote on 05.07.2016 at 19:31 in >>>> message > <577beef9.4000...@redhat.com>: >> As you may be aware, python 3 is a significant, backward-incompatible >> res

Re: [ClusterLabs] Doing reload right

2016-07-08 Thread Ken Gaillot
On 07/04/2016 07:13 AM, Ferenc Wágner wrote: > Ken Gaillot <kgail...@redhat.com> writes: > >> Does anyone know of an RA that uses reload correctly? > > My resource agents advertise a no-op reload action for handling their > "private" meta attributes.

Re: [ClusterLabs] Antw: Doing reload right

2016-07-05 Thread Ken Gaillot
On 07/04/2016 03:52 AM, Vladislav Bogdanov wrote: > 01.07.2016 18:26, Ken Gaillot wrote: > > [...] > >> You're right, "parameters" or "params" would be more consistent with >> existing usage. "Instance attributes" is probably the most tec

Re: [ClusterLabs] Antw: Re: Antw: Doing reload right

2016-07-05 Thread Ken Gaillot
On 07/04/2016 02:01 AM, Ulrich Windl wrote: > For the case of changing the contents of an external configuration file, the > RA would have to provide some reloadable dummy parameter then (maybe like > "config_generation=2"). That is a widely recommended approach for the current "reload"

[ClusterLabs] Proposed change for 1.1.16: ending python 2.6 compatibility

2016-07-05 Thread Ken Gaillot
onses than "slow down" ;-) -- Ken Gaillot <kgail...@redhat.com>

Re: [ClusterLabs] Minimal metadata for fencing agent

2016-08-05 Thread Ken Gaillot
On 08/05/2016 06:16 AM, Maciej Kopczyński wrote: > Thanks for your answer Thomas and sorry for messing up the layout of > messages - I was trying to write from a mobile phone using gmail... I > was able to put something up using what I found on the web and my own > writing. My agent seems to do

Re: [ClusterLabs] Singleton resource not being migrated

2016-08-05 Thread Ken Gaillot
On 08/05/2016 03:48 AM, Andreas Kurz wrote: > Hi, > > On Fri, Aug 5, 2016 at 2:08 AM, Nikita Koshikov > wrote: > > Hello list, > > Can you, please, help me in debugging 1 resource not being started > after node failover ? > >

Re: [ClusterLabs] corosync init script is not a LSB compliance ?

2016-06-30 Thread Ken Gaillot
On 06/30/2016 09:18 AM, Jan Friesse wrote: > Hi Li, > >> Hi all, >> I compiled latest version of corosync and found its init script >> is not LSB compliance. When I check stopped corosync service status >> with cmd "service corosync status" (on centos 6 without systemd), it >> returns "1"

[ClusterLabs] Doing reload right

2016-06-30 Thread Ken Gaillot
ard compatibility with both UI usage of unique and reload usage of unique. Plus, the worst that would happen is that the RA would stop being reloadable -- not as bad as the current possibilities from mis-implemented reload. My questions are: Does anyone know of an RA that uses reload correc

Re: [ClusterLabs] Sticky resource not sticky after unplugging network cable

2016-07-01 Thread Ken Gaillot
On 07/01/2016 02:13 AM, Auer, Jens wrote: > Hi, > > I have an active/passive cluster configuration and I am trying to make a > virtual IP resource sticky such that it does not move back to a node > after a fail-over. In my setup, I have a location preference for the > virtual IP to the "primary"

Re: [ClusterLabs] Master-Slaver resource Restarted after configuration change

2016-06-29 Thread Ken Gaillot
On 06/29/2016 01:35 PM, Ilia Sokolinski wrote: > >> >> I'm not sure there's a way to do this. >> >> If a (non-reloadable) parameter changes, the entire clone does need a >> restart, so the cluster will want all instances to be stopped, before >> proceeding to start them all again. >> >> Your

Re: [ClusterLabs] Default Behavior

2016-06-29 Thread Ken Gaillot
>> dc-version: 1.1.14-8.el6-70404b0 >> have-watchdog: false >> no-quorum-policy: ignore >> stonith-enabled: false >> Changed the behavior of the cluster in the new version or accident is not >> fully emulated? >> Thank you. >> >> >> Kind regards

Re: [ClusterLabs] error: crm_timer_popped: Shutdown Escalation (I_STOP) just popped in state S_POLICY_ENGINE

2016-06-29 Thread Ken Gaillot
On 06/29/2016 09:38 AM, Kostiantyn Ponomarenko wrote: > Hello, > > I am seeing those error messages in the syslog when the machine goes > down (one-node cluster): For anyone who missed the IRC discussion: This is probably the issue fixed by commit 6aae854 in the just-released Pacemaker 1.1.15.

Re: [ClusterLabs] Antw: Doing reload right

2016-07-01 Thread Ken Gaillot
On 07/01/2016 04:48 AM, Jan Pokorný wrote: > On 01/07/16 09:23 +0200, Ulrich Windl wrote: >>>>> Ken Gaillot <kgail...@redhat.com> wrote on 30.06.2016 at 18:58 in >>>>> message >> <57754f9f.8070...@redhat.com>: >>> I've been me

Re: [ClusterLabs] Singleton resource not being migrated

2016-08-16 Thread Ken Gaillot
On 08/05/2016 05:12 PM, Nikita Koshikov wrote: > Thanks, Ken, > > On Fri, Aug 5, 2016 at 7:21 AM, Ken Gaillot <kgail...@redhat.com > <mailto:kgail...@redhat.com>> wrote: > > On 08/05/2016 03:48 AM, Andreas Kurz wrote: > > Hi, > > >

Re: [ClusterLabs] Failover When Host is Up, Out of Order Logs

2017-02-01 Thread Ken Gaillot
On 01/31/2017 11:44 AM, Corey Moullas wrote: > I have been getting extremely strange behavior from a Corosync/Pacemaker > install on OVH Public Cloud servers. > > > > After hours of Googling, I thought I would try posting here to see if > somebody knows what to do. > > > > I see this in my

Re: [ClusterLabs] [Question] About a change of crm_failcount.

2017-02-02 Thread Ken Gaillot
On 02/02/2017 12:23 PM, renayama19661...@ybb.ne.jp wrote: > Hi All, > > By the next correction, the user was not able to set a value except zero in > crm_failcount. > > - [Fix: tools: implement crm_failcount command-line options correctly] >- >

Re: [ClusterLabs] A stop job is running for pacemaker high availability cluster manager

2017-02-02 Thread Ken Gaillot
g/messages sometimes has relevant messages from non-cluster components. You'd want to look for messages like "Caught 'Terminated' signal" and "Shutting down", as well as resources being stopped ("_stop_0"), then various "Disconnect" and "Stopping" mess

Re: [ClusterLabs] Live Guest Migration timeouts for VirtualDomain resources

2017-02-01 Thread Ken Gaillot
>> Where does that original op name come from in the VirtualDomain resource >> definition? How can we get the initial meta value changed and shipped > with >> a valid operation name (i.e. migrate_to), and >> maybe a more reasonable migrate_to timeout value... something >

Re: [ClusterLabs] Antw: Resource Priority

2017-02-01 Thread Ken Gaillot
On 02/01/2017 09:07 AM, Ulrich Windl wrote: Chad Cravens wrote on 01.02.2017 at 15:52 in > message > : >> Hello Cluster Fans! >> >> I've had a great time working with the clustering software.

Re: [ClusterLabs] Using "mandatory" startup order but avoiding depending clones from restart after member of parent clone fails

2017-02-08 Thread Ken Gaillot
On 02/06/2017 05:25 PM, Alejandro Comisario wrote: > guys, really happy to post my first doubt. > > i'm kinda having an "conceptual" issue that's bringing me, lots of issues > i need to ensure that order of starting resources are mandatory but > that is causing me a huge issue, that is if just

Re: [ClusterLabs] Antw: Re: Pacemaker kill does not cause node fault ???

2017-02-06 Thread Ken Gaillot
On 02/06/2017 03:28 AM, Ulrich Windl wrote: >>>> RaSca <ra...@miamammausalinux.org> wrote on 03.02.2017 at 14:00 in > message > <0de64981-904f-5bdb-c98f-9c59ee47b...@miamammausalinux.org>: > >> On 03/02/2017 11:06, Ferenc Wágner wrote: >&

Re: [ClusterLabs] Failure to configure iface-bridge resource causes cluster node fence action.

2017-02-06 Thread Ken Gaillot
, since it failed the resource and fenced the > node instead of disabling the resource. > Just checking with you to be sure. > > Thanks again.. > > Scott Greenlese ... IBM KVM on System Z Solutions Test, Poughkeepsie, N.Y. > INTERNET: swgre...@us.ibm.com > > > > Inactive h

Re: [ClusterLabs] [Question] About a change of crm_failcount.

2017-02-03 Thread Ken Gaillot
On 02/02/2017 12:33 PM, Ken Gaillot wrote: > On 02/02/2017 12:23 PM, renayama19661...@ybb.ne.jp wrote: >> Hi All, >> >> By the next correction, the user was not able to set a value except zero in >> crm_failcount. >> >> - [Fix: tools: implement crm_f

Re: [ClusterLabs] [Question] About log collection of crm_report.

2017-01-23 Thread Ken Gaillot
On 01/23/2017 04:17 PM, renayama19661...@ybb.ne.jp wrote: > Hi All, > > When I carry out Pacemaker1.1.15 and Pacemaker1.1.16 in RHEL7.3, log in > conjunction with pacemaker is not collected in the file which I collected in > sosreport. > > > This seems to be caused by the next correction and

Re: [ClusterLabs] Pacemaker kill does not cause node fault ???

2017-01-30 Thread Ken Gaillot
On 01/10/2017 04:24 AM, Stefan Schloesser wrote: > Hi, > > I am currently testing a 2 node cluster under Ubuntu 16.04. The setup seems > to be working ok including the STONITH. > For test purposes I issued a "pkill -f pace" killing all pacemaker processes > on one node. > > Result: > The node

Re: [ClusterLabs] [Question] About log collection of crm_report.

2017-01-25 Thread Ken Gaillot
osreport contents? > If it is such a thing, we can understand. > > - And I test crm_report at the present, but seem to have some problems. > - I intend to report the problem by Bugzilla again. > > Best Regards, > Hideo Yamauchi. Hi Hideo, You are right, that is a problem. I've

Re: [ClusterLabs] Need help in setting up HA cluster for applications/services other than Apache tomcat.

2017-02-20 Thread Ken Gaillot
On 02/18/2017 10:55 AM, Chad Cravens wrote: > Hello Vijay: > > it seems you may want to consider developing custom Resource Agents. > Take a look at the following guide: > http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html > > I have created several, it is pretty straightforward and has

Re: [ClusterLabs] Never join a list without a problem...

2017-02-24 Thread Ken Gaillot
On 02/24/2017 08:36 AM, Jeffrey Westgate wrote: > Greetings all. > > I have inherited a pair of Scientific Linux 6 boxes used as front-end load > balancers for our DNS cluster. (Yes, I inherited that, too.) > > It was time to update them so we pulled snapshots (they are VMWare VMs, very >

Re: [ClusterLabs] question about equal resource distribution

2017-02-17 Thread Ken Gaillot
On 02/17/2017 08:43 AM, Ilia Sokolinski wrote: > Thank you! > > What quantity does pacemaker tries to equalize - number of running resources > per node or total stickiness per node? > > Suppose I have a bunch of web server groups each with IPaddr and apache > resources, and a fewer number of

Re: [ClusterLabs] Antw: Re: Antw: Re: Pacemaker kill does not cause node fault ???

2017-02-13 Thread Ken Gaillot
On 02/08/2017 02:49 AM, Ferenc Wágner wrote: > Ken Gaillot <kgail...@redhat.com> writes: > >> On 02/07/2017 01:11 AM, Ulrich Windl wrote: >> >>> Ken Gaillot <kgail...@redhat.com> writes: >>> >>>> On 02/06/2017 03:28 AM, Ulrich W

Re: [ClusterLabs] two node cluster: vm starting - shutting down 15min later - starting again 15min later ... and so on

2017-02-09 Thread Ken Gaillot
On 02/09/2017 10:48 AM, Lentes, Bernd wrote: > Hi, > > i have a two node cluster with a vm as a resource. Currently i'm just testing > and playing. My vm boots and shuts down again in 15min gaps. > Surely this is related to "PEngine Recheck Timer (I_PE_CALC) just popped > (90ms)" found in

Re: [ClusterLabs] Q (SLES11 SP4): Delay after node came up (info: throttle_send_command: New throttle mode: 0000 (was ffffffff))

2017-02-09 Thread Ken Gaillot
On 01/16/2017 04:25 AM, Ulrich Windl wrote: > Hi! > > I have a question: The following happened in our 3-node cluster (n1, n2, n3): > n3 was DC, n2 was offlined, n2 came online again, n1 rebooted (went > offline/online), then n2 rebooted (offline/online) > > I observed a significant delay after

Re: [ClusterLabs] Problems with corosync and pacemaker with error scenarios

2017-02-09 Thread Ken Gaillot
On 01/16/2017 11:18 AM, Gerhard Wiesinger wrote: > Hello Ken, > > thank you for the answers. > > On 16.01.2017 16:43, Ken Gaillot wrote: >> On 01/16/2017 08:56 AM, Gerhard Wiesinger wrote: >>> Hello, >>> >>> I'm new to corosync and pacemaker and

Re: [ClusterLabs] Trouble setting up selfcompiled Apache in a pacemaker cluster on Oracle Linux 6.8

2017-02-09 Thread Ken Gaillot
On 01/16/2017 10:16 AM, Souvignier, Daniel wrote: > Hi List, > > > > I’ve got trouble getting Apache to work in a Pacemaker cluster I set up > between two Oracle Linux 6.8 hosts. The cluster itself works just fine, > but Apache won’t come up. Thing is here, this Apache is different from a >

Re: [ClusterLabs] Pacemaker cluster not working after switching from 1.0 to 1.1

2017-02-09 Thread Ken Gaillot
On 01/16/2017 01:16 PM, Rick Kint wrote: > >> Date: Mon, 16 Jan 2017 09:15:44 -0600 >> From: Ken Gaillot <kgail...@redhat.com> >> To: users@clusterlabs.org >> Subject: Re: [ClusterLabs] Pacemaker cluster not working after >> switching from 1.0 to 1

Re: [ClusterLabs] two node cluster: vm starting - shutting down 15min later - starting again 15min later ... and so on

2017-02-10 Thread Ken Gaillot
On 02/10/2017 06:49 AM, Lentes, Bernd wrote: > > > - On Feb 10, 2017, at 1:10 AM, Ken Gaillot kgail...@redhat.com wrote: > >> On 02/09/2017 10:48 AM, Lentes, Bernd wrote: >>> Hi, >>> >>> i have a two node cluster with a vm as a resource. Curren

Re: [ClusterLabs] 答复: Re: clone resource not get restarted on fail

2017-02-14 Thread Ken Gaillot
On 02/13/2017 07:08 PM, he.hailo...@zte.com.cn wrote: > Hi, > > > > crm configure show > > + crm configure show > > node $id="336855579" paas-controller-1 > > node $id="336855580" paas-controller-2 > > node $id="336855581" paas-controller-3 > > primitive apigateway ocf:heartbeat:apigateway

Re: [ClusterLabs] clone resource not get restarted on fail

2017-02-13 Thread Ken Gaillot
On 02/13/2017 07:57 AM, he.hailo...@zte.com.cn wrote: > Pacemaker 1.1.10 > > Corosync 2.3.3 > > > this is a 3 nodes cluster configured with 3 clone resources, each > attached wih a vip resource of IPAddr2: > > > >crm status > > > Online: [ paas-controller-1 paas-controller-2

Re: [ClusterLabs] Antw: Re: [Question] About a change of crm_failcount.

2017-02-09 Thread Ken Gaillot
to specify the error code, but in this case, I think crm_resource -B (or the private attribute approach, if you're OK with limiting it to corosync 2 and pacemaker 1.1.13+) is better. >> - Original Message - >>> From: Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> >&

Re: [ClusterLabs] Failed reload

2017-02-09 Thread Ken Gaillot
On 02/08/2017 02:15 AM, Ferenc Wágner wrote: > Hi, > > There was an interesting discussion on this list about "Doing reload > right" last July (which I still haven't digested entirely). Now I've > got a related question about the current and intended behavior: what > happens if a reload

Re: [ClusterLabs] Using "mandatory" startup order but avoiding depending clones from restart after member of parent clone fails

2017-02-09 Thread Ken Gaillot
e. Of course, you can clean up the failure to start over (or set a failure-timeout to do that automatically). > On Thu, Feb 9, 2017 at 12:18 AM, Ken Gaillot <kgail...@redhat.com > <mailto:kgail...@redhat.com>> wrote: > > On 02/06/2017 05:25 PM, Alejandro Comisario wrote: >

Re: [ClusterLabs] MySQL Cluster: Strange behaviour when forcing movement of resources

2017-02-16 Thread Ken Gaillot
On 02/16/2017 02:26 AM, Félix Díaz de Rada wrote: > > Hi all, > > We are currently setting up a MySQL cluster (Master-Slave) over this > platform: > - Two nodes, on RHEL 7.0 > - pacemaker-1.1.10-29.el7.x86_64 > - corosync-2.3.3-2.el7.x86_64 > - pcs-0.9.115-32.el7.x86_64 > There is a IP address

Re: [ClusterLabs] I question whether STONITH is working.

2017-02-15 Thread Ken Gaillot
On 02/15/2017 12:17 PM, dur...@mgtsciences.com wrote: > I have 2 Fedora VMs (node1, and node2) running on a Windows 10 machine > using Virtualbox. > > I began with this. > http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/ > > > When it came to fencing, I

Re: [ClusterLabs] Antw: Re: Live Guest Migration timeouts for VirtualDomain resources

2017-01-19 Thread Ken Gaillot
On 01/19/2017 01:36 AM, Ulrich Windl wrote: >>>> Ken Gaillot <kgail...@redhat.com> wrote on 18.01.2017 at 16:32 in >>>> message > <4b02d3fa-4693-473b-8bed-dc98f9e3f...@redhat.com>: >> On 01/17/2017 04:45 PM, Scott Greenlese wrote: >>>

Re: [ClusterLabs] how do I disable/negate resource option?

2017-01-19 Thread Ken Gaillot
On 01/19/2017 06:30 AM, lejeczek wrote: > hi all > > how can it be done? Is it possible? > many thanks, > L. Check the man page / documentation for whatever tool you're using (crm, pcs, etc.). Each one has its own syntax.

Re: [ClusterLabs] Antw: Re: VirtualDomain started in two hosts

2017-01-17 Thread Ken Gaillot
On 01/17/2017 08:52 AM, Ulrich Windl wrote: Oscar Segarra wrote on 17.01.2017 at 10:15 in > message > : >> Hi, >> >> Yes, I will try to explain myself better. >> >> *Initially* >> On node1

Re: [ClusterLabs] Antw: Re: VirtualDomain started in two hosts

2017-01-17 Thread Ken Gaillot
o it will detect anything running at that time, and start or stop services as needed to meet the configured requirements. > 2017-01-17 16:38 GMT+01:00 Ken Gaillot <kgail...@redhat.com > <mailto:kgail...@redhat.com>>: > > On 01/17/2017 08:52 AM, Ulrich Windl wrote: >

Re: [ClusterLabs] Pacemaker cluster not working after switching from 1.0 to 1.1 (resend as plain text)

2017-01-16 Thread Ken Gaillot
A preliminary question -- what cluster layer are you running? Pacemaker 1.0 worked with heartbeat or corosync 1, while Ubuntu 14.04 ships with corosync 2 by default, IIRC. There were major incompatible changes between corosync 1 and 2, so it's important to get that right before looking at

Re: [ClusterLabs] Problems with corosync and pacemaker with error scenarios

2017-01-16 Thread Ken Gaillot
On 01/16/2017 08:56 AM, Gerhard Wiesinger wrote: > Hello, > > I'm new to corosync and pacemaker and I want to set up an nginx cluster > with quorum. > > Requirements: > - 3 Linux machines > - On 2 machines floating IP should be handled and nginx as a load > balancing proxy > - 3rd machine is

Re: [ClusterLabs] Live Guest Migration timeouts for VirtualDomain resources

2017-01-18 Thread Ken Gaillot
) trace: > lrmd_rsc_execute: Nothing further to do for zs95kjg110061_res > Jan 17 13:55:14 [27555] zs95kj crmd: ( utils.c:1942 ) debug: > create_operation_update: do_update_resource: Updating resource > zs95kjg110061_res after migrate_to op Timed Out (interval=0) > Jan 17 13:55:14 [27555]

Re: [ClusterLabs] Antw: Re: VirtualDomain started in two hosts

2017-01-18 Thread Ken Gaillot
On 01/18/2017 03:49 AM, Ferenc Wágner wrote: > Ken Gaillot <kgail...@redhat.com> writes: > >> * When you move the VM, the cluster detects that it is not running on >> the node you told it to keep it running on. Because there is no >> "Stopped" monitor,

Re: [ClusterLabs] how do I disable/negate resource option?

2017-01-20 Thread Ken Gaillot
On 01/19/2017 09:52 AM, lejeczek wrote: > > > On 19/01/17 15:30, Ken Gaillot wrote: >> On 01/19/2017 06:30 AM, lejeczek wrote: >>> hi all >>> >>> how can it be done? Is it possible? >>> many thanks, >>> L. >> Check the man pag

Re: [ClusterLabs] Mysql slave did not start replication after failure, and read-only IP also remained active on the much outdated slave

2016-08-22 Thread Ken Gaillot
On 08/22/2016 07:24 AM, Attila Megyeri wrote: > Hi Andrei, > > I waited several hours, and nothing happened. And actually, we can see from the configuration you provided that cluster-recheck-interval is 2 minutes. I don't see anything about stonith; is it enabled and tested? This looks like a

Re: [ClusterLabs] R: Re: Antw: Re: Ordering Sets of Resources

2017-03-01 Thread Ken Gaillot
>> Original message >> From: "Ken Gaillot" <kgail...@redhat.com> >> Date: 01/03/2017 15.57 >> To: "Ulrich Windl"<ulrich.wi...@rz.uni-regensburg.de>, <users@clusterlabs.org> >> Subject: Re: [ClusterLabs] Antw: Re: Ordering Sets o

Re: [ClusterLabs] Ordering Sets of Resources

2017-02-26 Thread Ken Gaillot
On 02/25/2017 03:35 PM, iva...@libero.it wrote: > Hi all, > i have configured a two node cluster on redhat 7. > > Because I need to manage resources stopping and starting singularly when > they are running I have configured cluster using order set constraints. > > Here the example > > Ordering

Re: [ClusterLabs] R: Re: Ordering Sets of Resources

2017-02-27 Thread Ken Gaillot
to ensure that things are stopped in order. If your goal is to start and stop them in order *if* they're both starting or stopping, but not *require* it, then you want kind=Optional instead of Mandatory. > > >> ----Original message >> From: "Ken Gaillot"
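To make the Optional versus Mandatory distinction concrete, here is a small sketch that builds the ordering constraint element with the kind attribute set either way. The resource names and the id scheme are hypothetical, and in practice a tool such as pcs or crm would generate this element.

```python
import xml.etree.ElementTree as ET


def order_constraint(first, then, kind="Mandatory"):
    """Build an rsc_order element; resource names are supplied by the caller."""
    if kind not in ("Mandatory", "Optional", "Serialize"):
        raise ValueError(f"unknown kind: {kind}")
    return ET.Element("rsc_order", {
        "id": f"order-{first}-{then}-{kind}",  # illustrative id scheme
        "first": first,
        "then": then,
        "kind": kind,
    })


# Advisory ordering: applied only when both resources are starting or stopping.
advisory = order_constraint("rscA", "rscB", kind="Optional")
print(ET.tostring(advisory, encoding="unicode"))
```

Only the kind attribute differs between the two forms, so switching an existing constraint from required to advisory ordering is a one-attribute change.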

Re: [ClusterLabs] Never join a list without a problem...

2017-02-27 Thread Ken Gaillot
On 02/27/2017 01:48 PM, Jeffrey Westgate wrote: > I think I may be on to something. It seems that every time my boxes start > showing increased host load, the preceding change that takes place is: > > crmd: info: throttle_send_command: New throttle mode: 0100 (was > ) > > I'm

Re: [ClusterLabs] corosync/pacemaker on ~100 nodes cluser

2016-08-23 Thread Ken Gaillot
On 08/23/2016 11:46 AM, Klaus Wenninger wrote: > On 08/23/2016 06:26 PM, Radoslaw Garbacz wrote: >> Hi, >> >> I would like to ask for settings (and hardware requirements) to have >> corosync/pacemaker running on about 100 nodes cluster. > Actually I had thought that 16 would be the limit for full

Re: [ClusterLabs] pacemakerd quits after few seconds with some errors

2016-08-22 Thread Ken Gaillot
On 08/22/2016 12:17 PM, Gabriele Bulfon wrote: > Hi, > > I built corosync/pacemaker for our XStreamOS/illumos : corosync starts > fine and log correctly, pacemakerd quits after some seconds with the > attached log. > Any idea where is the issue? Pacemaker is not able to communicate with corosync

Re: [ClusterLabs] fence_apc delay?

2016-09-02 Thread Ken Gaillot
On 09/02/2016 08:14 AM, Dan Swartzendruber wrote: > > So, I was testing my ZFS dual-head JBOD 2-node cluster. Manual > failovers worked just fine. I then went to try an acid-test by logging > in to node A and doing 'systemctl stop network'. Sure enough, pacemaker > told the APC fencing agent

Re: [ClusterLabs] "VirtualDomain is active on 2 nodes" due to transient network failure

2016-09-02 Thread Ken Gaillot
On 09/01/2016 09:39 AM, Scott Greenlese wrote: > Andreas, > > You wrote: > > /"Would be good to see your full cluster configuration (corosync.conf > and cib) - but first guess is: no fencing at all and what is your > "no-quorum-policy" in Pacemaker?/ > > /Regards,/ > /Andreas"/ > > Thanks

Re: [ClusterLabs] fence_apc delay?

2016-09-06 Thread Ken Gaillot
On 09/05/2016 09:38 AM, Marek Grac wrote: > Hi, > > On Mon, Sep 5, 2016 at 3:46 PM, Dan Swartzendruber > wrote: > > ... > Marek, thanks. I have tested repeatedly (8 or so times with disk > writes in progress) with 5-7 seconds and have

Re: [ClusterLabs] What cib_stats line means in logfile

2016-09-06 Thread Ken Gaillot
On 09/05/2016 03:59 PM, Jan Pokorný wrote: > On 05/09/16 21:26 +0200, Jan Pokorný wrote: >> On 25/08/16 17:55 +0200, Sébastien Emeriau wrote: >>> When i check my corosync.log i see this line : >>> >>> info: cib_stats: Processed 1 operations (1.00us average, 0% >>> utilization) in the last

Re: [ClusterLabs] fence_apc delay?

2016-09-06 Thread Ken Gaillot
On 09/06/2016 10:20 AM, Dan Swartzendruber wrote: > On 2016-09-06 10:59, Ken Gaillot wrote: > > [snip] > >> I thought power-wait was intended for this situation, where the node's >> power supply can survive a brief outage, so a delay is needed to ensure >> it drai

Re: [ClusterLabs] ip clustering strange behaviour

2016-08-31 Thread Ken Gaillot
On 08/30/2016 01:52 AM, Gabriele Bulfon wrote: > Sorry for reiterating, but my main question was: > > why does node 1 removes its own IP if I shut down node 2 abruptly? > I understand that it does not take the node 2 IP (because the > ssh-fencing has no clue about what happened on the 2nd node),

Re: [ClusterLabs] data loss of network would cause Pacemaker exit abnormally

2016-08-31 Thread Ken Gaillot
On 08/30/2016 01:58 PM, chenhj wrote: > Hi, > > This is a continuation of the email below(I did not subscrib this maillist) > > http://clusterlabs.org/pipermail/users/2016-August/003838.html > >>>From the above, I suspect that the node with the network loss was the >>DC, and from its point of

Re: [ClusterLabs] systemd RA start/stop delays

2016-08-31 Thread Ken Gaillot
On 08/30/2016 05:18 AM, Dejan Muhamedagic wrote: > Hi, > > On Thu, Aug 18, 2016 at 09:00:24AM -0500, Ken Gaillot wrote: >> On 08/17/2016 08:17 PM, TEG AMJG wrote: >>> Hi >>> >>> I am having a problem with a simple Active/Passive cluster which >>>

Re: [ClusterLabs] When does Pacemaker shoot other nodes in the head

2016-09-09 Thread Ken Gaillot
On 09/09/2016 08:52 AM, Auer, Jens wrote: > Hi, > > a client asked me to describe the conditions when Pacemaker uses STONITH > to bring the cluster into a known state. The documentation says that > this happens when "we cannot establish with certainty a state of some > node or resource", but I

Re: [ClusterLabs] Pacemaker quorum behavior

2016-09-09 Thread Ken Gaillot
On 09/09/2016 04:27 AM, Klaus Wenninger wrote: > On 09/08/2016 07:31 PM, Scott Greenlese wrote: >> >> Hi Klaus, thanks for your prompt and thoughtful feedback... >> >> Please see my answers nested below (sections entitled, "Scott's >> Reply"). Thanks! >> >> - Scott >> >> >> Scott Greenlese ... IBM

Re: [ClusterLabs] "VirtualDomain is active on 2 nodes" due to transient network failure

2016-09-09 Thread Ken Gaillot
fore an > intentional reboot. > > Thanks! > > Scott Greenlese ... IBM Solutions Test, Poughkeepsie, N.Y. > INTERNET: swgre...@us.ibm.com > PHONE: 8/293-7301 (845-433-7301) M/S: POK 42HA/P966 > > > Inactive hide details for Ken Gaillot ---09/02/2016 10:01:15 AM---From: &g

Re: [ClusterLabs] Cold star of one node only

2016-09-13 Thread Ken Gaillot
On 09/13/2016 03:27 PM, Gienek Nowacki wrote: > Hi, > > I'm still testing (before production running) the solution with > pacemaker+corosync+drbd+dlm+gfs2 on Centos7 with double-primary config. > > I have two nodes: wirt1v and wirt2v - each node contains LVM partition > with DRBD (/dev/drbd2)
