Re: [ClusterLabs] migrating VirtualDomain starts migration of second VirtualDomain in reverse direction

2018-09-20 Thread Ken Gaillot
lts > > section (however that's done using your tools of choice). > > -- > > Ken Gaillot > > Hi, > > i did it that way: > > configure show > ... > rsc_defaults rsc-options: \ > default-resource-stickiness=200 OK, drop "default-" an

Re: [ClusterLabs] migrating VirtualDomain starts migration of second VirtualDomain in reverse direction

2018-09-20 Thread Ken Gaillot
ly. > Or is the value too low ? > > Bernd I think you meant default-resource-stickiness ... and even that's deprecated in 1.1 and gone in 2.0. :-) The proper way is to set resource-stickiness in the rsc_defaults section (however that's done using your tools of choice). -- Ken Gaillot __

Re: [ClusterLabs] pcs API / python module

2018-09-20 Thread Ken Gaillot
acemaker command-line tools (crm_resource, crm_node, etc.). Those interfaces are quite stable, and you could make python wrappers for executing them. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinf
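A wrapper of the kind suggested above could be as small as the following Python sketch. It is only an illustration: the names build_argv and run_tool are made up for this example, and it assumes the Pacemaker tools are installed and on PATH.

```python
import subprocess


def build_argv(tool, *args):
    """Return the argument list for a Pacemaker command-line tool."""
    return [tool, *[str(a) for a in args]]


def run_tool(tool, *args, runner=subprocess.run):
    """Run a tool and return (exit code, stdout, stderr).

    `runner` is injectable (hypothetical convenience) so the wrapper can be
    exercised on a machine that has no cluster installed.
    """
    result = runner(build_argv(tool, *args), capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr
```

On a cluster node, run_tool("crm_resource", "--list") would return the tool's exit code and its output as text, which the caller can then parse.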

Re: [ClusterLabs] [Question and Request] QUERY behavior of glue's plugin.

2018-09-12 Thread Ken Gaillot
_host_check="status" \ > (snip) Yes, that will do it. > > Also, if this setting is correct, there is no document for "status" > setting. > >  - http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pace > maker_Explained/_differences_of_

Re: [ClusterLabs] Complex Pacemaker resource depedency

2018-09-11 Thread Ken Gaillot
es always come up after at  > least one ip is running on the servers? > > > Regards, > > Vass If I understand the question correctly, you want resource sets (in an ordering constraint, with require-all=false): http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-sing

Re: [ClusterLabs] Non-cloned resource moves before cloned resource startup on unstandby

2018-09-11 Thread Ken Gaillot
erver_start_0 locally on node2.mydomain.com > Sep  7 15:03:31 node2 crmd[58188]:   notice: Initiating start > operation SharedRootCrons_start_0 locally on node2.mydomain.com > Sep  7 15:03:31 node2 crmd[58188]:   notice: Initiating start > operation SharedUserCrons_start_0 locall

Re: [ClusterLabs] About fencing stonith

2018-09-06 Thread Ken Gaillot
onfigure primitive RADIUS-IP ocf:heartbeat:IPaddr2 \ > params ip="192.168.0.9" nic="eth0" cidr_netmask="24" \ > op monitor interval=10s timeout=20s > crm configure primitive RADIUS lsb:freeradius op monitor interval=10s > timeout=20s > crm configure clone RADI

Re: [ClusterLabs] Antw: Rebooting a standby node triggers lots of transitions

2018-09-05 Thread Ken Gaillot
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratc > > h.pdf  > > Bugs: http://bugs.clusterlabs.org  > > > > ___________ > Users mailing list: Users@clusterlabs.org > https://lists.clusterlabs.org/mailman/listinfo/

Re: [ClusterLabs] Pacemaker startup retries

2018-09-05 Thread Ken Gaillot
rom source, you can apply the patch that fixes the issue to the 1.1.14 code base: https://github.com/ClusterLabs/pacemaker/commit/98457d1635db1222f93599b6021e662e766ce62d -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://list

Re: [ClusterLabs] Pacemaker startup retries

2018-09-05 Thread Ken Gaillot
On Wed, 2018-09-05 at 09:51 -0500, Ken Gaillot wrote: > On Wed, 2018-09-05 at 16:38 +0200, Cesar Hernandez wrote: > > Hi > > > > > > > > Ah, this rings a bell. Despite having fenced the node, the > > > cluster > > > still conside

Re: [ClusterLabs] Pacemaker startup retries

2018-09-05 Thread Ken Gaillot
> > >  Oh :( I'm using Pacemaker-1.1.14. > Do you know if this reboot retries are just run 3 times? All the > tests I've done the rebooting is finished after 3 times. > > Thanks > Cesar No, if I remember correctly, it would just keep going until

Re: [ClusterLabs] Pacemaker startup retries

2018-09-05 Thread Ken Gaillot
> Filesystem(p_fs_datosweb)[962]: 2018/08/31_11:00:05 INFO: Running > start for /dev/drbd/by-res/datoswebstorage on /mnt/datosweb > Filesystem(p_fs_database)[961]: 2018/08/31_11:00:05 INFO: Running > start for /dev/drbd/by-res/databasestorage on /mnt/database > > > .. > > > Can

Re: [ClusterLabs] Antw: Re: Antw: Q: native_color scores for clones

2018-09-05 Thread Ken Gaillot
On Wed, 2018-09-05 at 09:32 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 04.09.2018 um > > > > 19:21 in Nachricht > > <1536081690.4387.6.ca...@redhat.com>: > > On Tue, 2018-09-04 at 11:22 +0200, Ulrich Windl wrote: > > >

Re: [ClusterLabs] Pacemaker startup retries

2018-08-31 Thread Ken Gaillot
the cluster itself) or some external program. If it's the cluster, I'd look at the "pengine:" logs on the DC before that, to see if there are any hints (node unclean, etc.). Then keep going backward until the ultimate cause is found. -- Ken Gaillot

Re: [ClusterLabs] Pacemaker startup retries

2018-08-30 Thread Ken Gaillot
\ > default-resource-stickiness=200 \ > stonith-timeout=180s \ > last-lrm-refresh=1534489943 > > > Thanks > > César Hernández Bañó -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org htt

Re: [ClusterLabs] Q: native_color scores for clones

2018-08-30 Thread Ken Gaillot
resource has scores 0, 1, and -INFINITY, and the > ":1" resource has score 1 once and -INFINITY twice. > > When I look at the "clone_color" scores, the prm_DLM:* primitives > look as expected (no -INFINITY). However the cln_DLM clones have > score like 1,

Re: [ClusterLabs] Antw: Re: Q: ordering clones with interleave=false

2018-08-30 Thread Ken Gaillot
On Thu, 2018-08-30 at 08:28 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 29.08.2018 um > > > > 20:30 in > > Nachricht > <1535567455.5594.5.ca...@redhat.com>: > > On Wed, 2018‑08‑29 at 13:30 +0200, Ulrich Windl wrote: > > > Hi!

Re: [ClusterLabs] Q: ordering clones with interleave=false

2018-08-29 Thread Ken Gaillot
ifference, whether the resource cannot run on an online > node, or is unable due to a standby or offline node? > > Regards, > Ulrich Interleave=false only applies to instances that will be started in the current transition, so offline nodes don't prevent dependent resources from sta

Re: [ClusterLabs] Q: Resource Groups vs Resources for stickiness and colocation?

2018-08-29 Thread Ken Gaillot
6-12.el7.x86_64) > > /Ian This sounds like a bug. Feel free to submit a report at bugs.clusterlabs.org and attach the policy engine input file with the unexpected behavior. FYI a group's stickiness is the sum of the stickiness of each active member, though no score can be bigger than I
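The summing rule can be sketched in Python as below. This assumes the cut-off word in the snippet is INFINITY, which Pacemaker documents as 1,000,000; the function names are made up for this illustration, and the rule that -INFINITY outweighs +INFINITY follows Pacemaker's documented score arithmetic rather than anything stated in this message.

```python
INFINITY = 1_000_000  # Pacemaker's documented value for INFINITY


def add_scores(a, b):
    """Add two scores: -INFINITY wins, then +INFINITY, else a clamped sum."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


def group_stickiness(active_member_stickiness):
    """Sum the stickiness of each active group member, capped at INFINITY."""
    total = 0
    for score in active_member_stickiness:
        total = add_scores(total, score)
    return total
```

For a group with three active members that each have stickiness 200, this gives 600.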

Re: [ClusterLabs] Redundant ring not recovering after node is back

2018-08-24 Thread Ken Gaillot
> Here you have some of my configuration settings on node 1 > > > > > > > (I probed > > > > > > > already > > > > > > > to change rrp_mode): > > > > > > > > > > > > > > *- corosync.conf* > > > > > > > > > > > > > > > > >

Re: [ClusterLabs] Q: (SLES11 SP4) lrm_rsc_op without last-run?

2018-08-23 Thread Ken Gaillot
> The node is not completely up-to-date, and it's using pacemaker- > 1.1.12-18.1... > > Regards, > Ulrich > > > _______ > Users mailing list: Users@clusterlabs.org > https://lists.clusterlabs.org/mailman/listinfo/users > > Proj

Re: [ClusterLabs] Q: automaticlly remove expired location constraints

2018-08-23 Thread Ken Gaillot
7:26Z" > > One problem is that the date value is not a constant, and it had to > be compared against the current date > > Regards, > Ulrich crm_resource --clear -r RSC will clear all cli-* constraints -- Ken Gaillot ___ U

Re: [ClusterLabs] Antw: Re: Spurious node loss in corosync cluster

2018-08-21 Thread Ken Gaillot
device utilization, then look for > network bottlenecks... > > A new corosync release cannot fix those, most likely. > > Regards, > Ulrich > > > > > In any case, for the current scenario, we did not see any > > scheduling > > related messages. > >

Re: [ClusterLabs] Different Times in the Corosync Log?

2018-08-21 Thread Ken Gaillot
g a > very > short interval sequentially (i.e. no intermittent failure recovered > with > a restart of lrmd, AFAICT).  In case it can have any bearing, how do > you start pacemaker -- systemd, initscript, as a corosync plugin, > something else? -- Ken Gaillot

Re: [ClusterLabs] Antw: Re: Q: ordering for a monitoring op only?

2018-08-21 Thread Ken Gaillot
On Tue, 2018-08-21 at 07:49 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 20.08.2018 um > > > > 16:49 in > > Nachricht > <1534776566.6465.5.ca...@redhat.com>: > > On Mon, 2018‑08‑20 at 10:51 +0200, Ulrich Windl wrote: > > >

Re: [ClusterLabs] Q: Stickyness across resource restarts

2018-08-20 Thread Ken Gaillot
. I haven't tested the behavior with older versions to see if it's different though. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getti

Re: [ClusterLabs] Q: ordering for a monitoring op only?

2018-08-20 Thread Ken Gaillot
had > written a monitor for HP-UX' cluster that did not have this problem, > even though the configuration files were read from NFS (It's not > magic: Just periodically copy them to shared memory, and read the > config from shared memory). > > Regards, > Ulrich -- Ken

Re: [ClusterLabs] Q: Forcing a role change of master/slave resource

2018-08-20 Thread Ken Gaillot
ing "-G" to see the current scores or "-v " to change them. The node with the highest score will be promoted. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users

Re: [ClusterLabs] Spurious node loss in corosync cluster

2018-08-20 Thread Ken Gaillot
    } > } > service { >     name: pacemaker >     ver: 1 > } > amf { >     mode: disabled > } > > Thanks in advance for the help. > Prasad > > ___ > Users mailing list: Users@clusterlabs.org > https://lists.clusterlabs.

Re: [ClusterLabs] Pacemaker confirm that node was fenced successfully

2018-08-13 Thread Ken Gaillot
Praze > > Banka: Fio banka a.s. > Číslo účtu: 2400330446/2010 > BIC: FIOBCZPPXX > IBAN: CZ82 2010 0024 0033 0446 > > > On 13 Aug 2018, at 17:15, Ken Gaillot wrote: > > > > On Sat, 2018-08-11 at 17:38 +0200, FeldHost™ Admin wrote: > > >

Re: [ClusterLabs] Pacemaker confirm that node was fenced successfully

2018-08-13 Thread Ken Gaillot
labs.org > https://lists.clusterlabs.org/mailman/listinfo/users > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch. > pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot ___

Re: [ClusterLabs] Pacemaker ordering constraints and resource failures

2018-08-08 Thread Ken Gaillot
On Wed, 2018-08-08 at 20:55 +0300, Andrei Borzenkov wrote: > 08.08.2018 16:59, Ken Gaillot пишет: > > On Wed, 2018-08-08 at 07:36 +0300, Andrei Borzenkov wrote: > > > 06.08.2018 20:07, Devin A. Bougie пишет: > > > > What is the best way to make sure pacemaker does

Re: [ClusterLabs] What am I Doing Wrong with Constraints?

2018-08-08 Thread Ken Gaillot
t01 with p_fs_clust01 (score:INFINITY) (id:colocation- > p_vip_clust01-p_fs_clust01-INFINITY) >   p_vip_clust02 with p_fs_clust02 (score:INFINITY) (id:colocation- > p_vip_clust02-p_fs_clust02-INFINITY) >   p_azip_clust02 with p_vip_clust02 (score:INFINITY) (id:colocation- > p_azip_cl

Re: [ClusterLabs] Pacemaker ordering constraints and resource failures

2018-08-08 Thread Ken Gaillot
org/mailman/listinfo/users > > > > Project Home: http://www.clusterlabs.org > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratc > > h.pdf > > Bugs: http://bugs.clusterlabs.org > > > > ___ > U

Re: [ClusterLabs] Why Won't Resources Move?

2018-08-02 Thread Ken Gaillot
k in November of 2017, and you made the same > comment back then. > > https://lists.clusterlabs.org/pipermail/users/2017-November/013975.ht > ml > > And the solution turned out to be the same for me as it was for that > guy. On the node where I was getting the errors, SELINUX was >

Re: [ClusterLabs] Why Won't Resources Move?

2018-08-01 Thread Ken Gaillot
igning on to the > CIB service: Transport endpoint is not connected ] The message likely came from the resource agent calling crm_attribute to set a node attribute. That message usually means the cluster isn't running on that node, so it's highly suspect. The cib might have crashed, which should be

Re: [ClusterLabs] Fence agent executing thousands of API calls per hour

2018-08-01 Thread Ken Gaillot
gt; > > > > > > Unfortunately the logging available from vmWare doesn't give > > > > > a lot of information - it just says the number of API calls, > > > > > not which API(s) were called. > > > > > > > > > > Any ideas what might be going on

Re: [ClusterLabs] Fence agent ends up stopped with no clear reason why

2018-08-01 Thread Ken Gaillot
> > * vmware_fence_start_0 on q-gp2-dbpg57-1 'unknown error' (1): > > call=77, status=Error, exitreason='none', > >    last-rc-change='Mon Jul 30 21:46:30 2018', queued=1ms, > > exec=1862ms > > * vmware_fence_start_0 on q-gp2-dbpg57-3 'unknown error' (1): > > call=42, status

Re: [ClusterLabs] ban node or disable (all) resources upon node addition to the cluster - how?

2018-08-01 Thread Ken Gaillot
ster to false, then add any number of new nodes; to allow a resource on one of the new nodes, add a location constraint enabling it. Another approach would be to set resource-stickiness to INFINITY, so when you add the node, nothing moves. -- Ken Gaillot __

Re: [ClusterLabs] 2 active node with different service and 1 passive

2018-07-31 Thread Ken Gaillot
500), and R on node 2 with a -INFINITY score. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlab

Re: [ClusterLabs] ask for help for a pacemaker problem

2018-07-26 Thread Ken Gaillot
id=5361 However an unanswered question is how the loop got started. One of the nodes thought it received a shutdown request, but the other node didn't think it sent one. That is a mystery here. If you can find the "Shutdown REQ" message, the logs from both nodes around that time might she

Re: [ClusterLabs] ping Resource Agent doesnt work

2018-07-25 Thread Ken Gaillot
mailing list: Users@clusterlabs.org > https://lists.clusterlabs.org/mailman/listinfo/users > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch. > pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot

Re: [ClusterLabs] ping Resource Agent doesnt work

2018-07-24 Thread Ken Gaillot
mailing list: Users@clusterlabs.org > https://lists.clusterlabs.org/mailman/listinfo/users > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch. > pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot

Re: [ClusterLabs] FYI: regression using 2.0.0 / 1.1.19 Pacemaker Remote node with older cluster nodes

2018-07-17 Thread Ken Gaillot
to leave this as a known issue, and rely on the workarounds. On Mon, 2018-07-16 at 09:21 -0500, Ken Gaillot wrote: > Hi all, > > The just-released Pacemaker 2.0.0 and 1.1.19 releases have an issue > when a Pacemaker Remote node is upgraded before the cluster nodes. > > Pacemaker 2.0.0 contain

Re: [ClusterLabs] Weird Fencing Behavior

2018-07-17 Thread Ken Gaillot
rue > > > > Quorum: > >   Options: > > > > > > > > ___ > > Users mailing list: Users@clusterlabs.org > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > Project Home: http://www.clusterlabs.o

[ClusterLabs] FYI: regression using 2.0.0 / 1.1.19 Pacemaker Remote node with older cluster nodes

2018-07-16 Thread Ken Gaillot
rading any Pacemaker Remote nodes (which is the recommended practice anyway). -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting sta

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-11 Thread Ken Gaillot
ation must be done first. So maybe the idea was to always require someone to specify run levels. But it does make more sense that they would be listed in the LSB header. One reason it wouldn't have been an issue before is some older distros use the init script's chkconfig header ins

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-11 Thread Ken Gaillot
> > > > > > Project Home: http://www.clusterlabs.org > > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scra > > > tch.pdf > > > Bugs: http://bugs.clusterlabs.org > > > > > > > ___ > > Users m

[ClusterLabs] Pacemaker 1.1.19 released

2018-07-11 Thread Ken Gaillot
nks to all contributors of source code to this release, including Andrew Beekhof, Gao,Yan, Hideo Yamauchi, Jan Pokorný, Ken Gaillot, and Klaus Wenninger. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailma

Re: [ClusterLabs] Antw: OCF Return codes OCF_NOT_RUNNING

2018-07-11 Thread Ken Gaillot
> if I have a resource threshold set >1,  i get start->monitor->stop > > cycle > > until the threshold is consumed > > Then either your start is broken, or your monitor is broken. Try to > validate your RA using ocf-tester before using it. > > Regards,

Re: [ClusterLabs] What triggers fencing?

2018-07-11 Thread Ken Gaillot
ason not to do this is that if you use 0, > > > > > > > then don't use > > > > > > > anything at all (0 is default), and any other value > > > > > > > causes avoidable > > > > > > > fence delays. > > > &

Re: [ClusterLabs] Clearing failed actions

2018-07-09 Thread Ken Gaillot
> > Also, is there a way to clear one specific item from the list, or > > is clearing > > all the only option? > > pcs failcount reset [node] With the low level tools, you can use -r / --resource and/or -N / --node with crm_resource to limit the clean-up. --
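A limited clean-up of that kind could be assembled as in this Python sketch. The function name cleanup_argv and the resource and node names in the usage note are hypothetical; --cleanup, --resource and --node are crm_resource options.

```python
def cleanup_argv(resource=None, node=None):
    """Build a crm_resource clean-up command, optionally limited to one
    resource and/or one node. With neither given, nothing is limited."""
    argv = ["crm_resource", "--cleanup"]
    if resource:
        argv += ["--resource", resource]
    if node:
        argv += ["--node", node]
    return argv
```

For example, cleanup_argv("p_vip", "node1") yields the argument list for cleaning up only the hypothetical resource p_vip on the hypothetical node node1; the list can be passed to subprocess.run on a cluster node.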

[ClusterLabs] Pacemaker 2.0.0 has been released

2018-07-06 Thread Ken Gaillot
er Explained" document has grown large enough that topics related to cluster administration have been moved to their own new document, "Pacemaker Administration": http://clusterlabs.org/pacemaker/doc/ Many thanks to all contributors of source code to this release, including Andrew Beekho

Re: [ClusterLabs] Pacemaker alert framework

2018-07-06 Thread Ken Gaillot
could even combine everything into a single custom resource agent for use as a master/slave resource, where the master is the only instance that actually runs the resource, and the slaves just act on the notifications. > > Regards, > Klaus > > > Thanks > > > > /Ian.

Re: [ClusterLabs] Cluster from scratch - 7.6. Configure the Cluster for the DRBD device

2018-07-05 Thread Ken Gaillot
ieve the note about the version shipped with CentOS 7.1 is no longer an issue with recent versions. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterl

Re: [ClusterLabs] Problem with pacemaker resources when NTP sync is done

2018-07-04 Thread Ken Gaillot
to handle large time jumps. Jumps forward aren't too bad, but jumps backward can cause significant trouble. > #  pacemakerd --version > Pacemaker 1.1.16 > Written by Andrew Beekhof > # corosync -v > Corosync Cluster Engine, version '2.4.2' > Copyright (c) 2006-2009 R

Re: [ClusterLabs] Antw: Re: Antw: Re: Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)

2018-06-28 Thread Ken Gaillot
On Thu, 2018-06-28 at 09:09 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 27.06.2018 um > > > > 16:18 in Nachricht > > <1530109097.6452.1.ca...@redhat.com>: > > On Wed, 2018-06-27 at 07:41 +0200, Ulrich Windl wrote: > > > > >

Re: [ClusterLabs] Antw: Re: Antw: Re: Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)

2018-06-28 Thread Ken Gaillot
On Thu, 2018-06-28 at 09:13 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 27.06.2018 um > > > > 16:32 in Nachricht > > <1530109926.6452.3.ca...@redhat.com>: > > On Wed, 2018-06-27 at 09:18 -0500, Ken Gaillot wrote: > > > On

Re: [ClusterLabs] Pacemaker not restarting Resource on same node

2018-06-28 Thread Ken Gaillot
oing anything else. Certain OCF resource agent exit codes are considered "hard" errors that prevent retrying on the same node: missing dependencies, file permission errors, etc. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http
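As a rough Python illustration of the two examples named above: the numeric values are the standard OCF exit codes, but the set below deliberately lists only the two cases mentioned in the message (missing dependencies, file permission errors) and is not a complete list of hard errors. The names HARD_ERRORS and is_hard_error are made up for this sketch.

```python
# Standard OCF resource agent exit codes (subset).
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_PERM = 4        # insufficient permissions
OCF_ERR_INSTALLED = 5   # required software is missing on this node

# Assumption: only the two examples from the message are listed here;
# Pacemaker treats some other codes as hard errors as well.
HARD_ERRORS = frozenset({OCF_ERR_PERM, OCF_ERR_INSTALLED})


def is_hard_error(exit_code):
    """Return True if the exit code should prevent retrying on the same node."""
    return exit_code in HARD_ERRORS
```

A generic failure (exit code 1) is not in the set, so it would still be eligible for a retry on the same node.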

Re: [ClusterLabs] Install fresh pacemaker + corosync fails

2018-06-28 Thread Ken Gaillot
c features, I'd go with whatever stock packages are available for libqb and corosync. knet will be supported by corosync 3 and is bleeding-edge at the moment (though probably solid). If you do want to compile libqb and/or corosync, the guide on the wiki grabs the la

Re: [ClusterLabs] VM failure during shutdown

2018-06-27 Thread Ken Gaillot
> storage thus from transitive rule windows_VM ---> node1 > > pcs constraint location clone_ProcDRBD_SigmaVMs prefers sgw-01 > > pcs constraint colocation add windows_VM_res with > StorageDRBD_SigmaVMs INFINITY > > pcs constraint order start StorageDRBD_SigmaVMs_rers then start

Re: [ClusterLabs] Antw: Re: Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)

2018-06-27 Thread Ken Gaillot
On Wed, 2018-06-27 at 09:18 -0500, Ken Gaillot wrote: > On Wed, 2018-06-27 at 07:41 +0200, Ulrich Windl wrote: > > > > > Ken Gaillot schrieb am 26.06.2018 um > > > > > 18:22 in Nachricht > > > > <1530030128.5202.5.ca...@redhat.com>: > >

Re: [ClusterLabs] Antw: Re: Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)

2018-06-27 Thread Ken Gaillot
On Wed, 2018-06-27 at 07:41 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 26.06.2018 um > > > > 18:22 in Nachricht > > <1530030128.5202.5.ca...@redhat.com>: > > On Tue, 2018-06-26 at 10:45 +0300, Vladislav Bogdanov wrote: > >

[ClusterLabs] Pacemaker 2.0.0-rc6 now available

2018-06-26 Thread Ken Gaillot
release may come in handy:   https://wiki.clusterlabs.org/wiki/Pacemaker_2.0_Changes Many thanks to contributors of source code to this release, including Jan Pokorný, Klaus Wenninger, and Ken Gaillot. -- Ken Gaillot ___ Users mailing list: Users

Re: [ClusterLabs] Stop one VM, another tries to migrate

2018-06-26 Thread Ken Gaillot
f cib=1.443.2 > source=match_graph_event:310 complete=false > Jun 26 07:02:09 [4557] alpha   crmd: info: match_graph_event: >  Action Lapras_migrate_to_0 (30) confirmed on beta (rc=1) > Jun 26 07:02:09 [4557] alpha   crmd: info: > process_graph_event: Det

Re: [ClusterLabs] VM failure during shutdown

2018-06-26 Thread Ken Gaillot
resource cleanup windows_VM_res > After the above steps the VM is located on the correct node and > everything is ok. > > Is my approach correct ? > > Your opinion would be valuable, > Sincerely  > > > On 06/25/2018 07:15 PM, Ken Gaillot wrote: > > On Mon, 201

Re: [ClusterLabs] difference between external/ipmi and fence_ipmilan

2018-06-26 Thread Ken Gaillot
agents. Thus, you often see an "external/*" agent and a "fence_*" agent available for the same physical device. However, they are completely different implementations, so there may be substantive differences as well. I'm not familiar enough with these two to addres

Re: [ClusterLabs] Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)

2018-06-26 Thread Ken Gaillot
> > > > ___ > > Users mailing list: Users@clusterlabs.org > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > Project Home: http://www.clusterlabs.org > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratc > >

Re: [ClusterLabs] VM failure during shutdown

2018-06-25 Thread Ken Gaillot
On Mon, 2018-06-25 at 09:47 -0500, Ken Gaillot wrote: > On Mon, 2018-06-25 at 11:33 +0300, Vaggelis Papastavros wrote: > > Dear friends , > > > > We have the following configuration : > > > > CentOS7 , pacemaker 0.9.152 and Corosync 2.4.0, storage with DRBD >

Re: [ClusterLabs] VM failure during shutdown

2018-06-25 Thread Ken Gaillot
ration for section  > status: OK (rc=0, origin=sgw-02/attrd/11, version=0.4704.70) > Jun 25 07:41:37 [5137] sgw-02  attrd: info:  > attrd_cib_callback:    Update 11 for last-failure- > WindowSentinelOne_res:  > OK (0) > Jun 25 07:41:37 [5137] sgw-02  attrd: info: 

[ClusterLabs] Pacemaker-1.1.19-rc1 now available

2018-06-20 Thread Ken Gaillot
ing Andrew Beekhof, Gao,Yan, Hideo Yamauchi, Jan Pokorný, and Ken Gaillot. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting sta

Re: [ClusterLabs] corosync doesn't start any resource

2018-06-20 Thread Ken Gaillot
4 > Jun 20 12:17:07 [27559] zfs-serv3   crmd:   notice: > te_rsc_command: Initiating monitor operation resIPMI- > zfs4_monitor_6 locally on zfs-serv3 | action 12 > Jun 20 12:17:08 [27559] zfs-serv3   crmd: info: > process_lrm_event:  Result of monitor

Re: [ClusterLabs] corosync doesn't start any resource

2018-06-19 Thread Ken Gaillot
he host loses power entirely, IPMI will not respond, the fencing will fail, and the cluster will be unable to recover. On-board IPMI requires a back-up method such as an intelligent power switch or sbd. > > Quorum: >   Options: > > > > thanks for help! > best regards

Re: [ClusterLabs] Fencing libvirt

2018-06-19 Thread Ken Gaillot
On Mon, 2018-06-18 at 21:01 -0400, Jason Gauthier wrote: > On Mon, Jun 18, 2018 at 11:12 AM Jason Gauthier > wrote: > > > > On Mon, Jun 18, 2018 at 10:58 AM Ken Gaillot > > wrote: > > > > > > On Mon, 2018-06-18 at 10:10 -0400, Jason Gauthier wrote:

Re: [ClusterLabs] Fencing libvirt

2018-06-18 Thread Ken Gaillot
On Mon, 2018-06-18 at 10:10 -0400, Jason Gauthier wrote: > On Mon, Jun 18, 2018 at 9:55 AM Ken Gaillot > wrote: > > > > On Fri, 2018-06-15 at 21:39 -0400, Jason Gauthier wrote: > > > Greetings, > > > > > >    Previously, I was using fiber channe

Re: [ClusterLabs] corosync doesn't start any resource

2018-06-18 Thread Ken Gaillot
ttps://paste.debian.net/hidden/9376add7/ > > best regards > Stefan As of the end of that log file, the cluster does intend to start the resources: Jun 15 14:29:11 [5623] zfs-serv3 pengine:   notice: LogActions: Start   nfs-server (zfs-serv3) Jun 15 14:29:11 [5623] zfs-

Re: [ClusterLabs] Fencing libvirt

2018-06-18 Thread Ken Gaillot
pha beta" > \ > op monitor interval=2h \ > meta target-role=Stopped primitive st_libvirt > stonith:external/libvirt \ > params hypervisor_uri="qemu:///system" hostlist="alpha beta" > \ > op monitor interval=2h > >

Re: [ClusterLabs] ?==?utf-8?q? Limit of concurrent ressources to start?

2018-06-13 Thread Ken Gaillot
Hi, > > additional remark: > > With some tweaks I made my cluster start two resources (i.e. IP1 and > IP2) at the same time. But it takes about 4 seconds to that the > cluster starts the next resources (i.e. IP3 and IP4). > > Did anybody see this behaviour before? >

Re: [ClusterLabs] Pengine always trying to start the resource on the standby node.

2018-06-13 Thread Ken Gaillot
eam: Cannot write NULL to > /var/lib/pacemaker/cib/shadow.20008 >    Could not create '/var/lib/pacemaker/cib/shadow.20008': Success > > Could anyone help me how to read those messages and what's going on > my server? > > Thanks a lot.. > > > On Fri, Jun 8,

Re: [ClusterLabs] Pengine always trying to start the resource on the standby node.

2018-06-07 Thread Ken Gaillot
ra    pengine:     info: native_stop_constraints: > > > cluster_fs_stop_0 is implicit after clusterb is fenced > > > clustera    pengine:     info: native_stop_constraints: > > > cluster_vip_stop_0 is implicit after clusterb is fenced > > > clustera    pengine:   

Re: [ClusterLabs] resource agent Route active on multiple nodes

2018-06-06 Thread Ken Gaillot
ot* active. Attempt to start it failed on both > nodes. You need to investigate why it happened. Most obvious reason > would be missing "trust" table. Do you have fencing configured? The cluster will not normally attempt to recover a resource elsewhere unless the res

Re: [ClusterLabs] Resource-stickiness is not working

2018-06-05 Thread Ken Gaillot
> > > Regards, > > > imnotarobot > >  > > Your configuration is correct, but keep in mind scores of all kinds > > will be added together to determine where the final placement is. > >  > > In this case, I'd check that you don't have any constraints with

Re: [ClusterLabs] Knowing where a resource is running

2018-06-04 Thread Ken Gaillot
status of > the resources and parse the output, but I’d prefer a cleaner and less > fragile solution.  Any suggestions? > Thanks! You're right node B won't get notifications in that case, but you can check the value of OCF_RESKEY_CRM_meta_notify_master_uname in a star

Re: [ClusterLabs] Resource-stickiness is not working

2018-06-04 Thread Ken Gaillot
s case, I'd check that you don't have any constraints with a > higher score preferring the other node. For example, if you > previously  > did a "move" or "ban" from the command line, that adds a constraint > that has to be removed manually if you no longer want it

Re: [ClusterLabs] Resource-stickiness is not working

2018-06-01 Thread Ken Gaillot
s will be added together to determine where the final placement is. In this case, I'd check that you don't have any constraints with a higher score preferring the other node. For example, if you previously did a "move" or "ban" from the command l
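The effect of adding scores together can be shown with a deliberately simplified Python sketch. It ignores INFINITY handling, colocation and everything else the real scheduler considers; the function name, node names and numbers are made up for illustration.

```python
def preferred_node(current_node, stickiness, location_scores):
    """Add the resource's stickiness to its current node's location score
    and return (node with the highest total, all totals)."""
    totals = dict(location_scores)
    totals[current_node] = totals.get(current_node, 0) + stickiness
    best = max(totals, key=totals.get)
    return best, totals
```

With stickiness 200 on node1 and a leftover constraint that prefers node2 with a score of 500, preferred_node("node1", 200, {"node1": 0, "node2": 500}) picks node2, which is why a constraint with a higher score can override stickiness.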

Re: [ClusterLabs] Expected resource-discovery=exclusive behavior

2018-06-01 Thread Ken Gaillot
ist >  Id    Name                           State > >  9     sl-gate-01               running > > > [root@n03 mmike]# LANG=C virsh list >  Id    Name                           State > -

Re: [ClusterLabs] Why would a standby node be fenced? (was: How to set up fencing/stonith)

2018-05-31 Thread Ken Gaillot
at would not be treated like error > > (causing all sorts of fatal consequences) but still evaluated for > > dependencies (i.e. dependent resources would not be started). That > > would > > be ideal for such case. I'm not clear what such a result would mean. Is the goal to s

[ClusterLabs] Pacemaker 2.0.0-rc5 now available

2018-05-31 Thread Ken Gaillot
e devices and Pacemaker Remote connection resources). * Allow a monitor to be cancelled when its resource is unmanaged. The only known issue remaining to be resolved before final release is some tweaking of the transform of pre-2.0 configurations after an upgrade. -- Ken Gaillot ___

Re: [ClusterLabs] [questionnaire] Do you manage your pacemaker configuration by hand and (if so) what reusability features do you use?

2018-05-31 Thread Ken Gaillot
> [1] https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-si > ngle/Pacemaker_Explained/index.html#_reusing_resource_definitions > [2] https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-si > ngle/Pacemaker_Explained/index.html#s-reusing-config-elemen

Re: [ClusterLabs] Pacemaker PostgreSQL cluster

2018-05-30 Thread Ken Gaillot
> er_Explained/ap-upgrade.html > > What I want to do is first migrate pacemaker manually and then > automate it with some scripts. > > According to what Ken Gaillot said: > > "Rolling upgrades are always supported within the same major number > line > (i.e. 1.any

Re: [ClusterLabs] Pacemaker PostgreSQL cluster

2018-05-29 Thread Ken Gaillot
> > Good luck anyway :) > > > > --  > > Jehan-Guillaume de Rorthais > > Dalibo > > ___ > Users mailing list: Users@clusterlabs.org > https://lists.clusterlabs.org/mailman/listinfo/users > > Project Home: ht

Re: [ClusterLabs] Live migrate a VM in a cluster group

2018-05-29 Thread Ken Gaillot
On Tue, 2018-05-29 at 10:14 -0500, Ken Gaillot wrote: > On Sun, 2018-05-27 at 22:50 -0400, Jason Gauthier wrote: > > Greetings, > > > >  I've set up a cluster intended for VMs.  I created a VM, and have > > been pretty pleased with migrating it back and forth between

Re: [ClusterLabs] Pacemaker PostgreSQL cluster

2018-05-29 Thread Ken Gaillot
-- I'm guessing the regression fixes were already backported into those packages. > 4. Where I can find the list of (ubuntu) dependencies required to > pacemaker/corosync for 1.1.18 and 2.0.0? > > Thanks in advance for your help. > -- Ken Gaillot __

Re: [ClusterLabs] Live migrate a VM in a cluster group

2018-05-29 Thread Ken Gaillot
ou're confusing the cluster :) Any constraints starting with "cli-" were added by command-line tools doing move/ban/etc. They stay in effect until they are manually removed (the same tool will generally have a "clear" option). > location cli-prefer-p_CalibreVNC p_CalibreVNC role=
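To spot such leftovers, one could scan the constraints section of the CIB for IDs beginning with "cli-", as in this Python sketch. The sample XML is invented for illustration (only the ID cli-prefer-p_CalibreVNC comes from the thread), and cli_constraint_ids is a made-up name.

```python
import xml.etree.ElementTree as ET

# Invented sample of a CIB constraints section; node names and the second
# constraint are hypothetical.
SAMPLE = """
<constraints>
  <rsc_location id="cli-prefer-p_CalibreVNC" rsc="p_CalibreVNC" node="alpha" score="INFINITY"/>
  <rsc_location id="loc-manual" rsc="p_CalibreVNC" node="beta" score="100"/>
</constraints>
"""


def cli_constraint_ids(constraints_xml):
    """Return the IDs of constraints that were added by command-line tools."""
    root = ET.fromstring(constraints_xml)
    return [el.get("id") for el in root if el.get("id", "").startswith("cli-")]
```

Run against the sample, this reports only cli-prefer-p_CalibreVNC, the constraint that would need to be cleared by hand.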

Re: [ClusterLabs] How does failure-timeout works, will the resource not be scheduled when setting too short?

2018-05-27 Thread Ken Gaillot
le > > I tested it several times, and the results were the same. Why does > the resource not be scheduled when failure-timeout setting too short? > And what does  > > it have to do with the time consuming stop of another resource?  Is > this a bug? > > My pacemaker ve

Re: [ClusterLabs] PAF not starting resource successfully after node reboot (was: How to set up fencing/stonith)

2018-05-27 Thread Ken Gaillot
2 23:57:24 [2196] d-gp2-dbpg0-2 pengine:  warning: > pe_fence_node:   Node d-gp2-dbpg0-1 will be fenced because of > resource failure(s) which is why the cluster then wants to fence the node. (If a resource won't stop, the only way to recover it is to kill the entire node.) -- K

Re: [ClusterLabs] DLM fencing

2018-05-24 Thread Ken Gaillot
On Thu, 2018-05-24 at 16:14 +0200, Klaus Wenninger wrote: > On 05/24/2018 04:03 PM, Ken Gaillot wrote: > > On Thu, 2018-05-24 at 06:47 -0400, Jason Gauthier wrote: > > > On Thu, May 24, 2018 at 12:19 AM, Andrei Borzenkov <arvidjaar@gma > > > il.c > > > om>

Re: [ClusterLabs] DLM fencing

2018-05-24 Thread Ken Gaillot
> > > Users mailing list: Users@clusterlabs.org > > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > > > Project Home: http://www.clusterlabs.org > > > Getting started: http://www.clusterlabs.org/doc/Cluster_from

Re: [ClusterLabs] ethmonitor RA agent error. How can I fix this? (RHEL)

2018-05-22 Thread Ken Gaillot
after enabling eth0 of node1, error from previous procedure > still exist. > 4. Got an additional error, I have two errors now > 5. VirtualIP resource doesn't start > > > Regards, > > imnotarobot -- Ken Gaillot <kgail...@redhat.com> _

Re: [ClusterLabs] How to set up fencing/stonith

2018-05-18 Thread Ken Gaillot
> Here is the output of `pcs status` before powering off the primary: > > -- > Online: [ d-gp2-dbpg0-1 d-gp2-dbpg0-2 d-gp2-dbpg0-3 ] > > Full list of resources: > >  vfencing   (stonith:external/vcenter): Started d-gp2-dbpg0-1 >  postgresql-master-vip  
