Re: [ClusterLabs] Antw: Re: Notification agent and Notification recipients

2017-08-08 Thread Ken Gaillot
On Tue, 2017-08-08 at 17:40 +0530, Sriram wrote: > Hi Ulrich, > > > Please see inline. > > On Tue, Aug 8, 2017 at 2:01 PM, Ulrich Windl > wrote: > >>> Sriram wrote on 08.08.2017 at > 09:30 in message >

Re: [ClusterLabs] Antw: Re: big trouble with a DRBD resource

2017-08-08 Thread Ken Gaillot
On Tue, 2017-08-08 at 10:18 +0200, Ulrich Windl wrote: > >>> Ken Gaillot <kgail...@redhat.com> wrote on 07.08.2017 at 22:26 in > >>> message > <1502137587.5788.83.ca...@redhat.com>: > > [...] > > Unmanaging doesn't stop monitoring a r

Re: [ClusterLabs] big trouble with a DRBD resource

2017-08-07 Thread Ken Gaillot
displaying that differently due to the two supported rules. > Bernd > > > Helmholtz Zentrum Muenchen > Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) > Ingolstaedter Landstr. 1 > 85764 Neuherberg > www.helmholtz-muenchen.de > Aufsichtsratsvorsitzen

Re: [ClusterLabs] IPaddr2 RA and bonding

2017-08-07 Thread Ken Gaillot
_Explained/index.html#_moving_resources_due_to_connectivity_changes > BTW, I tried to find a solution on the bonding configuration which > disables the bond when no link is up, but I didn't find any. > > > > Tomer. > > > ___ > Users mailin

Re: [ClusterLabs] big trouble with a DRBD resource

2017-08-07 Thread Ken Gaillot
welt (GmbH) > Ingolstaedter Landstr. 1 > 85764 Neuherberg > www.helmholtz-muenchen.de > Aufsichtsratsvorsitzende: MinDir'in Baerbel Brumme-Bothe > Geschaeftsfuehrer: Prof. Dr. Guenther Wess, Heinrich Bassler, Dr. Alfons > Enhsen > Regist

Re: [ClusterLabs] nginx resource - how to reload config or do a config test

2017-08-07 Thread Ken Gaillot
nodes? No, but that's OK. You don't want to start or stop or change the configuration without involving the cluster, but tests and checks are fine to run outside cluster control. > -- > Best Regards > > Przemysław Kulczycki > System administrator > Avaleo > > Email: u...@av

Re: [ClusterLabs] dry-run an alert?

2017-08-07 Thread Ken Gaillot
to do it by hand -- just set the environment variables to simulate the event you want and then call the agent. The possible environment variables are described at: http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_writing_an_alert_agent -- Ken Gail
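The by-hand approach described in this reply could look roughly like the sketch below. It is only an illustration under stated assumptions: the agent here is a throwaway stand-in script (not a real alert agent), and the node name, description, and recipient path are made-up sample values. The CRM_alert_* variable names follow the alert agent section of Pacemaker Explained linked above; check that page for the full list that applies to your version.

```shell
#!/bin/sh
# Hypothetical stand-in for a real alert agent: it only echoes what it receives.
agent=$(mktemp)
cat > "$agent" <<'EOF'
#!/bin/sh
echo "kind=${CRM_alert_kind} node=${CRM_alert_node} desc=${CRM_alert_desc} to=${CRM_alert_recipient}"
EOF

# Simulate a node event by hand: set the variables for this one call,
# then run the agent. All values below are invented sample data.
result=$(CRM_alert_kind=node \
         CRM_alert_node=node1 \
         CRM_alert_desc=lost \
         CRM_alert_recipient=/tmp/alerts.log \
         sh "$agent")
echo "$result"

rm -f "$agent"
```

To dry-run a real agent, replace the stand-in with the path to your own agent script and set whichever variables that agent actually reads for the event kind you want to simulate.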

Re: [ClusterLabs] ClusterLabs.Org Documentation Problem?

2017-08-22 Thread Ken Gaillot
updated, and it's mostly independent of the underlying layer, so you should prefer that set. I plan to reorganize that page in the coming months, so I'll try to make it clearer. -- Ken Gaillot <kgail...@redhat.com> ___ Users

Re: [ClusterLabs] Resources still retains in Primary Node even though its interface went down

2017-05-03 Thread Ken Gaillot
On 05/03/2017 02:43 AM, pillai bs wrote: > Hi Experts!!! > > Am having two node setup for HA (Primary/Secondary) with > separate resources for Home/data/logs/Virtual IP.. As known the Expected > behavior should be , if Primary node went down, secondary has to take > in-charge

Re: [ClusterLabs] Resources still retains in primary node

2017-05-03 Thread Ken Gaillot
On 05/03/2017 02:30 AM, pillai bs wrote: > Hi Experts!!! > > Am having two node HA setup (Primary/Secondary) with > separate resources for Home/data/logs/Virtual IP.. As known the Expected > behavior should be , if Primary node went down, secondary has to take > in-charge (meaning

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-11 Thread Ken Gaillot
On 05/11/2017 03:00 PM, Ludovic Vaugeois-Pepin wrote: > Hi > I translated the a Postgresql multi state RA > (https://github.com/dalibo/PAF) in Python > (https://github.com/ulodciv/deploy_cluster), and I have been editing it > heavily. > > In parallel I am writing unit tests and functional tests.

Re: [ClusterLabs] newbie question

2017-05-11 Thread Ken Gaillot
On 05/05/2017 03:09 PM, Sergei Gerasenko wrote: > Hi, > > I have a very simple question. > > Pacemaker uses a dedicated "multicast" interface for the totem protocol. > I'm using pacemaker with LVS to provide HA load balancing. LVS uses > multicast interfaces to sync the status of TCP

Re: [ClusterLabs] how to set a dedicated fence delay for a stonith agent ?

2017-05-10 Thread Ken Gaillot
On 05/10/2017 12:20 AM, Kristoffer Grönlund wrote: > "Lentes, Bernd" writes: > >> - On May 8, 2017, at 9:20 PM, Bernd Lentes >> bernd.len...@helmholtz-muenchen.de wrote: >> >>> Hi, >>> >>> i remember that digimer often campaigns for a fence delay in a

Re: [ClusterLabs] cloned resources ordering and remote nodes problem

2017-05-09 Thread Ken Gaillot
On 04/13/2017 08:49 AM, Radoslaw Garbacz wrote: > Thank you, however in my case this parameter does not change the > described behavior. > > I have a more detail example: > order: res_A-clone -> res_B-clone -> res_C > when "res_C" is not on the node, which had "res_A" instance failed, it > will

Re: [ClusterLabs] Pacemaker's "stonith too many failures" log is not accurate

2017-05-17 Thread Ken Gaillot
On 05/17/2017 04:56 AM, Klaus Wenninger wrote: > On 05/17/2017 11:28 AM, 井上 和徳 wrote: >> Hi, >> I'm testing Pacemaker-1.1.17-rc1. >> The number of failures in "Too many failures (10) to fence" log does not >> match the number of actual failures. > > Well it kind of does as after 10 failures it

Re: [ClusterLabs] Pacemaker 1.1.17-rc1 now available

2017-05-09 Thread Ken Gaillot
On 05/09/2017 03:51 AM, Lars Ellenberg wrote: > Yay! > > On Mon, May 08, 2017 at 07:50:49PM -0500, Ken Gaillot wrote: >> "crm_attribute --pattern" to update or delete all node >> attributes matching a regular expression > > Just a nit, but "pattern

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Ken Gaillot
nt DC: test1 (version 1.1.15-11.el7_3.4-e174ec8) - > partition with quorum > Last updated: Fri May 12 10:45:41 2017 Last change: Fri > May 12 10:45:39 2017 by root via crm_attribute on test1 > > 3 nodes and 4 resources configured > >

Re: [ClusterLabs] how to set a dedicated fence delay for a stonith agent ?

2017-05-10 Thread Ken Gaillot
On 05/10/2017 12:26 PM, Dimitri Maziuk wrote: > > i remember that digimer often campaigns for a fence delay in a 2-node > cluster. > ... > But ... a random delay does not seem to > be a reliable solution. > >> Some fence agents implement a delay parameter of their own, to set

Re: [ClusterLabs] ClusterIP won't return to recovered node

2017-06-12 Thread Ken Gaillot
On 06/10/2017 10:53 AM, Dan Ragle wrote: > So I guess my bottom line question is: How does one tell Pacemaker that > the individual legs of globally unique clones should *always* be spread > across the available nodes whenever possible, regardless of the number > of processes on any one of the

Re: [ClusterLabs] what is the best practice for removing a node temporary (e.g. for installing updates) ?

2017-06-19 Thread Ken Gaillot
On 06/19/2017 10:23 AM, Lentes, Bernd wrote: > Hi, > > what would you consider to be the best way for removing a node temporary from > the cluster, e.g. for installing updates ? > I thought "crm node maintenance node" would be the right way, but i was > astonished that the resources keep

Re: [ClusterLabs] Pacemaker 1.1.17 Release Candidate 4 (likely final)

2017-06-21 Thread Ken Gaillot
On 06/21/2017 02:58 AM, Ferenc Wágner wrote: > Ken Gaillot <kgail...@redhat.com> writes: > >> The most significant change in this release is a new cluster option to >> improve scalability. >> >> As users start to create clusters with hundreds of resourc

Re: [ClusterLabs] vip is not removed after node lost connection with the other two nodes

2017-06-23 Thread Ken Gaillot
On 06/23/2017 11:52 AM, Dimitri Maziuk wrote: > On 06/23/2017 11:24 AM, Jan Pokorný wrote: > >> People using ifdown or the iproute-based equivalent seem far >> too prevalent, even if for long time bystanders the idea looks >> continually disproved ad nauseam. > > Has anyone had a network card

[ClusterLabs] clusterlabs.org now supports https :-)

2017-06-26 Thread Ken Gaillot
pt.org/ [2] https://wiki.clusterlabs.org/ [3] https://bugs.clusterlabs.org/ -- Ken Gaillot <kgail...@redhat.com> ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.cluster

Re: [ClusterLabs] clearing failed actions

2017-06-21 Thread Ken Gaillot
8997] ctmgrpengine: info: LogActions: >>> Leave db- >>> mysql:1 (Slave ctdb2) >>> Jun 19 17:37:06 [18997] ctmgrpengine: notice: process_pe_message: >>> Calculated Transition 38: /var/lib/pacemaker/pengine/pe-input-16.bz2 >>> Jun 19 1

Re: [ClusterLabs] vip is not removed after node lost connection with the other two nodes

2017-06-23 Thread Ken Gaillot
On 06/22/2017 09:44 PM, Hui Xiang wrote: > Hi guys, > > I have setup 3 nodes(node-1, node-2, node-3) as controller nodes, an > vip is selected by pacemaker between them, after manually make the > management interface down in node-1 (used by corosync) but still have > connectivity to public or

Re: [ClusterLabs] both nodes OFFLINE

2017-05-22 Thread Ken Gaillot
On 05/13/2017 01:36 AM, 石井 俊直 wrote: > Hi. > > We have, sometimes, a problem in our two nodes cluster on CentOS7. Let node-2 > and node-3 > be the names of the nodes. When the problem happens, both nodes are > recognized OFFLINE > on node-3 and on node-2, only node-3 is recognized OFFLINE. > >

Re: [ClusterLabs] CIB: op-status=4 ?

2017-05-22 Thread Ken Gaillot
dbx_first_datas on > olegdbx39-vm02 to dbx_first_datas:1 > May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug: > find_anonymous_clone:Internally renamed dbx_swap_nodes on > olegdbx39-vm02 to dbx_swap_nodes:0 > May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug

[ClusterLabs] Pacemaker 1.1.17 Release Candidate 2

2017-05-23 Thread Ken Gaillot
configuration for constraints related to fence devices, to know whether to enable or disable them on the local node. Previously, after reading the initial configuration, it could detect later changes or removals of constraints, but not additions. Now, it can. -- Ken Gaillot <kgail...@redhat.

Re: [ClusterLabs] failcount is not getiing reset after failure_timeout if monitoring is disabled

2017-05-23 Thread Ken Gaillot
On 05/23/2017 08:00 AM, ashutosh tiwari wrote: > Hi, > > We are running a two node cluster(Active(X)/passive(Y)) having muliple > resources of type IpAddr2. > Running monitor operations for multiple IPAddr2 resource is actually > hoging the cpu, > as we have configured very low value for monitor

Re: [ClusterLabs] resource monitor logging

2017-05-25 Thread Ken Gaillot
On 05/24/2017 03:44 PM, Christopher Pax wrote: > > I am running postgresql as a resource in corosync. and there is a > monitor process that kicks off every few seconds to see if postgresql is > alive (it runs a select now()). My immediate concern is that it is > generating a lot of logs in auth.log,

Re: [ClusterLabs] In N+1 cluster, add/delete of one resource result in other node resources to restart

2017-05-19 Thread Ken Gaillot
red > > Online: [ 0005B94238BC 0005B9423910 0005B9427C5A ] > > > Observation is remaining two resources res2 and res1 were stopped > and started. > > > Regards, > Aswathi > > On Mon, May 15, 2017 at 8:11 PM, Ken Gaillot <kgail..

Re: [ClusterLabs] CIB: op-status=4 ?

2017-05-18 Thread Ken Gaillot
On 05/17/2017 06:10 PM, Radoslaw Garbacz wrote: > Hi, > > I have a question regarding ' 'op-status > attribute getting value 4. > > In my case I have a strange behavior, when resources get those "monitor" > operation entries in the CIB with op-status=4, and they do not seem to > be called

Re: [ClusterLabs] question about fence-virsh

2017-05-19 Thread Ken Gaillot
On 05/19/2017 03:47 PM, Andrew Kerber wrote: > What I am trying to say here is when I get one of the virtual machines > in a bad state, I can still log in and reboot it with the reboot > command. But I need my fencing resource to handle that reboot. > > On Fri, May 19, 2017 at 1:32 PM, Andrew

Re: [ClusterLabs] In N+1 cluster, add/delete of one resource result in other node resources to restart

2017-05-22 Thread Ken Gaillot
ated: Tue May 16 12:21:27 2017 Last change: Tue May 16 > 12:21:26 2017 by root via cibadmin on 0005B94238BC > > 3 nodes and 2 resources configured > > Online: [ 0005B94238BC 0005B9423910 0005B9427C5A ] > > > Observation is remaining two resources res2 and res1 wer

Re: [ClusterLabs] clearing failed actions

2017-05-30 Thread Ken Gaillot
On 05/30/2017 09:13 AM, Attila Megyeri wrote: > Hi, > > > > Shouldn’t the > > > > cluster-recheck-interval="2m" > > > > property instruct pacemaker to recheck the cluster every 2 minutes and > clean the failcounts? It instructs pacemaker to recalculate whether any actions need to be

Re: [ClusterLabs] clearing failed actions

2017-05-31 Thread Ken Gaillot
On 05/30/2017 02:50 PM, Attila Megyeri wrote: > Hi Ken, > > >> -Original Message----- >> From: Ken Gaillot [mailto:kgail...@redhat.com] >> Sent: Tuesday, May 30, 2017 4:32 PM >> To: users@clusterlabs.org >> Subject: Re: [ClusterLabs] clearing fail

Re: [ClusterLabs] Pacemaker's "stonith too many failures" log is not accurate

2017-05-31 Thread Ken Gaillot
>call_options & > st_opt_sync_call, FALSE); > + > /* mark this op as having notify's already sent */ > op->notify_sent = TRUE; > free_xml(reply); > > Regards, > Kazunori INOUE > >> -Original Message- >> From: Ken Gaillot [mailto:

Re: [ClusterLabs] Node attribute disappears when pacemaker is started

2017-05-31 Thread Ken Gaillot
On 05/26/2017 03:21 AM, 井上 和徳 wrote: > Hi Ken, > > I got crm_report. > > Regards, > Kazunori INOUE I don't think it attached -- my mail client says it's 0 bytes. >> -Original Message----- >> From: Ken Gaillot [mailto:kgail...@redhat.com] >> Sent: Frida

Re: [ClusterLabs] crm_resource -c field

2017-06-07 Thread Ken Gaillot
On 06/05/2017 10:19 AM, iva...@libero.it wrote: > Hello, > could you explain the meaning of fields in "crm_resource -c" command (c > in lowercase)? > > I've tried to search on web but i didn't find anything. > > Thanks and regards > > Ivan It's used solely by pacemaker's cluster test suite

Re: [ClusterLabs] Cloned IP not moving back after node restart or standby

2017-06-07 Thread Ken Gaillot
On 06/02/2017 06:33 AM, Takehiro Matsushima wrote: > Hi, > > You should not clone IPaddr2 resource. > Clone means that to run the resource at same time on both nodes, so > these nodes will have same duplicated IP address on a network. > > Specifically, you need to configure a IPaddr2 resource

Re: [ClusterLabs] Node attribute disappears when pacemaker is started

2017-06-08 Thread Ken Gaillot
Hub, so look at it. > https://github.com/inouekazu/pcmk_report/blob/master/pcmk-Fri-26-May-2017.tar.bz2 > >> -Original Message----- >> From: Ken Gaillot [mailto:kgail...@redhat.com] >> Sent: Thursday, June 01, 2017 8:43 AM >> To: users@clusterlabs.org >> Subject:

Re: [ClusterLabs] Pacemaker shutting down peer node

2017-06-15 Thread Ken Gaillot
On 06/15/2017 12:38 AM, Jaz Khan wrote: > Hi, > > I have been encountering this serious issue from past couple of months. > I really have no idea that why pacemaker sends shutdown signal to peer > node and it goes down. This is very strange and I am too much worried . > > This is not happening

Re: [ClusterLabs] ClusterIP won't return to recovered node

2017-06-16 Thread Ken Gaillot
On 06/16/2017 01:18 PM, Dan Ragle wrote: > > > On 6/12/2017 10:30 AM, Ken Gaillot wrote: >> On 06/12/2017 09:23 AM, Klaus Wenninger wrote: >>> On 06/12/2017 04:02 PM, Ken Gaillot wrote: >>>> On 06/10/2017 10:53 AM, Dan Ragle wrote: >>>>> So I

Re: [ClusterLabs] Pacemaker shutting down peer node

2017-06-16 Thread Ken Gaillot
peer ha-apex2 > is complete > > > Best regards, > Jaz > > > > > > Message: 1 > Date: Thu, 15 Jun 2017 13:53:00 -0500 > From: Ken Gaillot <kgail...@redhat.com <mailto:kgail...@redhat.com>> > To: users@clusterlabs.org <mail

Re: [ClusterLabs] How to fence cluster node when SAN filesystem fail

2017-05-02 Thread Ken Gaillot
Hi, Upstream documentation on fencing in Pacemaker is available at: http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm139683949958512 Higher-level tools such as crm shell and pcs make it easier; see their man pages and other documentation for

Re: [ClusterLabs] crm_mon -h (writing to a html-file) not showing all desired information and having trouble with the -d option

2017-05-08 Thread Ken Gaillot
On 05/08/2017 11:13 AM, Lentes, Bernd wrote: > Hi, > > playing around with my cluster i always have a shell with crm_mon running > because it provides me a lot of useful and current information concerning > cluster, nodes, resources ... > Normally i have a "crm_mon -nrfRAL" running. > I'd like

Re: [ClusterLabs] Instant service restart during failback

2017-05-08 Thread Ken Gaillot
If you look in the logs when the node comes back, there should be some "pengine:" messages noting that the restarts will be done, and then a "saving inputs in " message. If you can attach that file (both with and without the constraint changes would be ideal), I'll take a look at it. On

Re: [ClusterLabs] pacemaker daemon shutdown time with lost remote node

2017-05-08 Thread Ken Gaillot
On 04/28/2017 02:22 PM, Radoslaw Garbacz wrote: > Hi, > > I have a question regarding pacemaker daemon shutdown > procedure/configuration. > > In my case, when a remote node is lost pacemaker needs exactly 10minutes > to shutdown, during which there is nothing logged. > So my questions: > 1.

Re: [ClusterLabs] Antw: Antw: notice: throttle_handle_load: High CPU load detected

2017-05-08 Thread Ken Gaillot
: > High CPU load detected > > > > Thank you, Ken. > > This helps a lot. > > Now I am sure that my current approach fits best for me =) > > > Thank you, > > Kostia > > > > On Wed, Mar 30, 2016 at 11:10 PM, Ken Gaillot <kgail...@

Re: [ClusterLabs] Antw: Behavior after stop action failure with the failure-timeout set and STONITH disabled

2017-05-08 Thread Ken Gaillot
On 05/05/2017 07:49 AM, Jan Wrona wrote: > On 5.5.2017 08:15, Ulrich Windl wrote: > Jan Wrona wrote on 04.05.2017 at 16:41 in > message >> : >>> I hope I'll be able to explain the problem clearly and correctly. >>> >>>

Re: [ClusterLabs] stonith device locate on same host in active/passive cluster

2017-05-04 Thread Ken Gaillot
On 05/03/2017 09:04 PM, Albert Weng wrote: > Hi Marek, > > Thanks your reply. > > On Tue, May 2, 2017 at 5:15 PM, Marek Grac > wrote: > > > > On Tue, May 2, 2017 at 11:02 AM, Albert Weng

[ClusterLabs] Pacemaker 1.1.17-rc1 now available

2017-05-08 Thread Ken Gaillot
possible use cases, so your feedback is important and appreciated. Many thanks to all contributors of source code to this release, including Alexandra Zhuravleva, Andrew Beekhof, Aravind Kumar, Eric Marques, Ferenc Wágner, Yan Gao, Hayley Swimelar, Hideo Yamauchi, Igor Tsiglyar, Jan Pokorný

Re: [ClusterLabs] Node attribute disappears when pacemaker is started

2017-05-25 Thread Ken Gaillot
On 05/24/2017 05:13 AM, 井上 和徳 wrote: > Hi, > > After loading the node attribute, when I start pacemaker of that node, the > attribute disappears. > > 1. Start pacemaker on node1. > 2. Load configure containing node attribute of node2. >(I use multicast addresses in corosync, so did not set

Re: [ClusterLabs] clearing failed actions

2017-05-31 Thread Ken Gaillot
On 05/31/2017 12:17 PM, Ken Gaillot wrote: > On 05/30/2017 02:50 PM, Attila Megyeri wrote: >> Hi Ken, >> >> >>> -Original Message- >>> From: Ken Gaillot [mailto:kgail...@redhat.com] >>> Sent: Tuesday, May 30, 2017 4:32 PM >>>

[ClusterLabs] Pacemaker 1.1.17 Release Candidate 3

2017-05-31 Thread Ken Gaillot
Remote node unless necessary. Testing and feedback is welcome! -- Ken Gaillot <kgail...@redhat.com> ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting s

Re: [ClusterLabs] Pacemaker occasionally takes minutes to respond

2017-05-31 Thread Ken Gaillot
On 05/24/2017 08:04 AM, Attila Megyeri wrote: > Hi Klaus, > > Thank you for your response. > I tried many things, but no luck. > > We have many pacemaker clusters with 99% identical configurations, package > versions, and only this one causes issues. (BTW we use unicast for corosync, > but

Re: [ClusterLabs] How to avoid stopping ordered resources on cleanup?

2017-09-15 Thread Ken Gaillot
19ebe57c" transition-magic="0:0;6:1380:0:e2c19428-0707-4677- > a89a-ff1c19ebe57c" on_node="bam2-backend" call-id=" > Sep 13 06:43:55 [3826] bam1-omc    cib: info: > cib_process_request:  Completed cib_modify operation for section > status: OK

Re: [ClusterLabs] Force stopping the resources from a resource group in parallel

2017-09-15 Thread Ken Gaillot
erlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch. > pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot <kgail...@redhat.com> ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/lis

Re: [ClusterLabs] Cannot stop cluster due to order constraint

2017-09-15 Thread Ken Gaillot
aint order start main5 then stop backup5 kind=Serialize > pcs constraint order start main6 then stop backup6 kind=Serialize > > pcs constraint colocation add backup1 with main1 -200 > pcs constraint colocation add backup2 with main2 -200 > pcs constraint colocation ad

Re: [ClusterLabs] IP clone issue

2017-09-15 Thread Ken Gaillot
   ClusterIP:1(ocf::heartbeat:IPaddr2):Started > node01 > > > > But if one node fails the IP resource is not migrated to > active > > node as is said in documentation. > > > > Clone Set: ClusterIP-clone [ClusterIP] (unique) > >   

[ClusterLabs] Disabling stonith in Pacemaker 2.0 (was: Re: Pacemaker 1.1.18 deprecation warnings)

2017-09-18 Thread Ken Gaillot
On Mon, 2017-09-18 at 13:53 -0400, Digimer wrote: > On 2017-09-18 01:48 PM, Ken Gaillot wrote: > > As discussed at the recent ClusterLabs Summit, I plan to start the > > release cycle for Pacemaker 1.1.18 soon. > >  > > There will be the usual bug fixes a

[ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-18 Thread Ken Gaillot
reshold) * undocumented and ignored -r option to lrmd * compile-time option to use undocumented "notification-agent" and "notification-recipient" cluster properties instead of current "alerts" syntax * compatibility with CIB schemas below 1.0, and schema 1.1 (should not

Re: [ClusterLabs] Antw: Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Ken Gaillot
On Tue, 2017-09-19 at 09:13 +0200, Ulrich Windl wrote: > >>> Ken Gaillot <kgail...@redhat.com> wrote on 18.09.2017 at 19:48 > in message > <1505756918.5541.4.ca...@redhat.com>: > > As discussed at the recent ClusterLabs Summit, I plan to start the > >

Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Thu, 2017-09-21 at 09:26 +0200, Jehan-Guillaume de Rorthais wrote: > On Wed, 20 Sep 2017 21:25:51 -0400 > Digimer <li...@alteeve.ca> wrote: > > > On 2017-09-20 07:53 PM, Ken Gaillot wrote: > > > Hi everybody, > > >  > > > We've started a major

Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Wed, 2017-09-20 at 21:25 -0400, Digimer wrote: > On 2017-09-20 07:53 PM, Ken Gaillot wrote: > > Hi everybody, > >  > > We've started a major update of the ClusterLabs web design. The > main > > goal (besides making it look more modern) is to make the top-level >

Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-20 Thread Ken Gaillot
On Wed, 2017-09-20 at 11:48 +0200, Ferenc Wágner wrote: > Ken Gaillot <kgail...@redhat.com> writes: > > > * undocumented LRMD_MAX_CHILDREN environment variable > > (PCMK_node_action_limit is the current syntax) > > By the way, is the current syntax

Re: [ClusterLabs] some resources move after recovery

2017-09-20 Thread Ken Gaillot
> Roberto Muñoz > BME - Sistemas UNIX > C/ Tramontana, 2 Bis. Edificio 2 - 1ª Planta > 28230 Las Rozas, Madrid - España > Tlfn: +34-917095778 > > > P Antes de imprimir, piensa en el MEDIO AMBIENTE > AVISO LEGAL/DISCLAIMER -- Ken Gaillot <kgail...@redhat.com> _

Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Thu, 2017-09-21 at 11:56 +0200, Kai Dupke wrote: > On 09/21/2017 01:53 AM, Ken Gaillot wrote: > > Check it out at https://clusterlabs.org/ > > Two comments > > - I would like to see the logo used by as many > people/projects/marketingers, so I propose to link t

Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Thu, 2017-09-21 at 16:46 +0200, Kai Dupke wrote: > On 09/21/2017 04:42 PM, Ken Gaillot wrote: > > Yes, the FAQ needs an overhaul as well -- all the Pacemaker- > specific > > questions should be moved to a separate Pacemaker FAQ, and the top > FAQ > > should just have

[ClusterLabs] New website design and new-new logo

2017-09-20 Thread Ken Gaillot
-- Kristoffer Grönlund had a professional designer look at the one he created. I hope everyone likes the end result. It's simpler, cleaner and friendlier. Check it out at https://clusterlabs.org/ -- Ken Gaillot <kgail...@redhat.com> ___ Users mailin

Re: [ClusterLabs] High CPU during CIB sync

2017-09-15 Thread Ken Gaillot
shouldn't change its own configuration.) You should be able to reduce the CPU usage by setting "dampening" on the node attributes. This will make the cluster wait a bit of time before writing node attribute changes to the CIB, so the recalculation doesn't have to occur immediately

Re: [ClusterLabs] Pacemaker resource parameter reload confusion

2017-09-22 Thread Ken Gaillot
nted in the end, and I can't find anything in the changelog > either.  So, what do I miss here?  Parallel reload and stop looks > rather > suspicious, though... Nothing's been done about reload yet. It's waiting until we get around to an overhaul of the OCF resource agent stand

Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Ken Gaillot
uld be a log for non > > existentalert-agents prior to their unsuccessful first use. -- Ken Gaillot <kgail...@redhat.com> ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.

Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Ken Gaillot
s for that, but it will be later than 2.0. > On 18.9.2017 12:48:38 Ken Gaillot wrote: > > As discussed at the recent ClusterLabs Summit, I plan to start the > > release cycle for Pacemaker 1.1.18 soon. > >  > > There will be the usual bug fixes and a few small new feat

[ClusterLabs] Pacemaker 1.1.18-rc1 now available

2017-10-06 Thread Ken Gaillot
and appreciated. Many thanks to all contributors of source code to this release, including Andrew Beekhof, Aravind Kumar, Artur Novik, Bin Liu, Yan Gao, Hideo Yamauchi, Igor Tsiglyar, Jan Pokorný, Ken Gaillot, Klaus Wenninger, Nye Liu, and Valentin Vidic. -- Ken Gaillot <kgail...@redhat.

Re: [ClusterLabs] crm_resource --wait

2017-10-09 Thread Ken Gaillot
On Mon, 2017-10-09 at 16:37 +1000, Leon Steffens wrote: > Hi all, > > We have a use case where we want to place a node into standby and > then wait for all the resources to move off the node (and be started > on other nodes) before continuing.   > > In order to do this we call: > $ pcs cluster

Re: [ClusterLabs] corosync service not automatically started

2017-10-10 Thread Ken Gaillot
On Tue, 2017-10-10 at 12:24 +0200, Václav Mach wrote: > On 10/10/2017 11:40 AM, Valentin Vidic wrote: > > On Tue, Oct 10, 2017 at 11:26:24AM +0200, Václav Mach wrote: > > > # The primary network interface > > > allow-hotplug eth0 > > > iface eth0 inet dhcp > > > # This is an autoconfigured IPv6

Re: [ClusterLabs] crm_resource --wait

2017-10-10 Thread Ken Gaillot
On Tue, 2017-10-10 at 15:19 +1000, Leon Steffens wrote: > Hi Ken, > > I managed to reproduce this on a simplified version of the cluster, > and on Pacemaker 1.1.15, 1.1.16, as well as 1.1.18-rc1 > The steps to create the cluster are: > > pcs property set stonith-enabled=false > pcs property set

[ClusterLabs] Pacemaker 1.1.18 Release Candidate 2

2017-10-16 Thread Ken Gaillot
testing you can do is very welcome. -- Ken Gaillot <kgail...@redhat.com> ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterla

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-16 Thread Ken Gaillot
if the agent returns "failed" for both resources when either one fails, you could see something like that. I'd look at the logs on the DC and see why it decided to restart the second resource. -- Ken Gaillot <kgail...@redhat.com> ___

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-17 Thread Ken Gaillot
... I'd recommend writing your own OCF agent tailored to your service. It's not much more complicated than an init script. > On Mon, Oct 16, 2017 at 6:57 PM, Ken Gaillot <kgail...@redhat.com> > wrote: > > On Mon, 2017-10-16 at 18:30 +0200, Gerard Garcia wrote: > > > Hi, >

Re: [ClusterLabs] Debugging problems with resource timeout without any actions from cluster

2017-10-17 Thread Ken Gaillot
On Tue, 2017-10-17 at 15:30 +0600, Sergey Korobitsin wrote: > Ken Gaillot ☫ → To Cluster Labs - All topics related to open-source > clustering welcomed @ Thu, Oct 12, 2017 09:47 -0500 > > Thanks for the answer, Ken, > > > > I found several ways to achieve that: >

Re: [ClusterLabs] Pacemaker resource parameter reload confusion

2017-10-17 Thread Ken Gaillot
On Fri, 2017-09-22 at 18:30 +0200, Ferenc Wágner wrote: > Ken Gaillot <kgail...@redhat.com> writes: > > > Hmm, stop+reload is definitely a bug. Can you attach (or email it > > to me > > privately, or file a bz with it attached) the above pe-input file > >

Re: [ClusterLabs] Mysql upgrade in DRBD setup

2017-10-13 Thread Ken Gaillot
kes more sense if the mysql servers are running inside VMs or containers that can migrate between the physical machines. > > Thanks! > > > > -Original Message- > From: Ken Gaillot [mailto:kgail...@redhat.com]  > Sent: Thursday, October 12, 2017 9:22 PM > To: Clust

Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Ken Gaillot
ing corosync, in any case. In maintenance mode, that should be fine. I don't think a running pacemaker would be able to reconnect to corosync after corosync comes back. > What are you really trying to do, > what is the reason you need it in maintenance-mode > and stop pacemaker/corosync/ope

Re: [ClusterLabs] Corosync on a home network

2017-09-12 Thread Ken Gaillot
mailman/listinfo/users > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot <kgail...@redhat.com> _

[ClusterLabs] 2017 ClusterLabs Summit -- Pacemaker 1.2.0 or 2.0 talk

2017-09-06 Thread Ken Gaillot
his email is to start a discussion about these changes. Nothing is set in stone. We do want to focus more on removing legacy usage rather than adding new features in the 2.0 release. Anyone who has an opinion or questions about the changes mentioned above, or suggestions for similar changes, is encouraged

[ClusterLabs] Coming in 1.1.18: deprecating stonith-enabled

2017-09-25 Thread Ken Gaillot
et to "fence" (the default for stop operations), or any fence resource has been configured. If fencing is not possible, the cluster will behave as if stonith-enabled is false (even if it's not). -- Ken Gaillot <kgail...@redhat.com> ___ Users m

Re: [ClusterLabs] Transition aborted when disabling resource

2017-09-30 Thread Ken Gaillot
> match_graph_event:   Action cftmd1s1_start_0 (513) confirmed on > > vttwinformlrz2 (rc=0) > > Sep 25 23:50:08 [4495] vttwinformlrz1 stonith-ng: info: > > update_cib_stonith_devices_v2:   Updating device list from the > > cib: modify lrm_rsc_op[@id='cftmd1s1_la

Re: [ClusterLabs] Pacemaker starts with error on LVM resource

2017-10-01 Thread Ken Gaillot
On Thu, 2017-09-28 at 18:05 +0300, Octavian Ciobanu wrote: > Hello all. > > I have a test configuration with 2 nodes that is configured as iSCSI > storage. > > I've created a master/slave DRBD resource and a group that has the > following resources ordered as follow :  >  - iSCSI TCP IP/port

Re: [ClusterLabs] strange behaviour from pacemaker_remote

2017-10-01 Thread Ken Gaillot
On Thu, 2017-09-28 at 01:39 +0200, Adam Spiers wrote: > Hi all, > > When I do a > > pkill -9 -f pacemaker_remote > > to simulate failure of a remote node, sometimes I see things like: > > 08:29:32 d52-54-00-da-4e-05 pacemaker_remoted[5806]: error: No > ipc providers available for uid 0

Re: [ClusterLabs] Restarting a failed resource on same node

2017-10-03 Thread Ken Gaillot
On Mon, 2017-10-02 at 12:32 -0700, Paolo Zarpellon wrote: > Hi, > on a basic 2-node cluster, I have a master-slave resource where > master runs on a node and slave on the other one. If I kill the slave > resource, the resource status goes to "stopped". > Similarly, if I kill the the master

Re: [ClusterLabs] what does cluster do when 'resourceA with resourceB' happens

2017-10-03 Thread Ken Gaillot
On Tue, 2017-10-03 at 11:53 +0100, lejeczek wrote: > hi > > I'm reading "An A-Z guide to Pacemaker's Configurations  > Options" and in there it read: > "... > So when you are creating colocation constraints, it is  > important to consider whether you should > colocate A with B, or B with A. >

Re: [ClusterLabs] ClusterLabs.Org Documentation Problem?

2017-08-24 Thread Ken Gaillot
used with corosync 1.x as > a configuration layer and (more important) quorum provider. With Corosync 2.x > quorum provider is already in corosync so no need for cman. > > > > > > > -- > > Eric Robinson > > > > -Original Message- > > From

Re: [ClusterLabs] start one node only?

2017-08-24 Thread Ken Gaillot
or_all back to 1 once your cluster is back to normal. -- Ken Gaillot <kgail...@redhat.com> ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org

Re: [ClusterLabs] start one node only?

2017-08-24 Thread Ken Gaillot
On Thu, 2017-08-24 at 15:53 -0500, Dimitri Maziuk wrote: > On 08/24/2017 03:40 PM, Ken Gaillot wrote: > > > You could set wait_for_all to 0 in corosync.conf, then boot. The living > > node should try to fence the other one, and proceed if fencing succeeds. > > Did

Re: [ClusterLabs] Pacemaker in Azure

2017-08-24 Thread Ken Gaillot
ted: > http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org > > > > > ___ > Users mailing list: Users@clusterlabs.org > http://lists.clust

Re: [ClusterLabs] Antw: Retries before setting fail-count to INFINITY

2017-08-21 Thread Ken Gaillot
On Mon, 2017-08-21 at 15:39 +0200, Ulrich Windl wrote: > >>> Vaibhaw Pandey wrote on 21.08.2017 at 14:58 in > message >

Re: [ClusterLabs] Pacemaker stopped monitoring the resource

2017-08-31 Thread Ken Gaillot
html-single/Pacemaker_Explained/index.html#ap-ocf ); then add your resource to the cluster without migration-threshold or failure-timeout, and work out any issues with frequent failures; then finally set migration-threshold and failure-timeout to reflect how you want recovery to proceed. -- Ke

Re: [ClusterLabs] VirtualDomain live migration error

2017-08-31 Thread Ken Gaillot
> Anybody has experienced the same issue? > > > Thanks in advance for your help If something works from the command line but not when run by a daemon, my first suspicion is SELinux. Check the audit log for denials around that time. I'd also check the system log and Pacemaker detail
