Re: [ClusterLabs] pacemaker as data store

2018-05-15 Thread Ken Gaillot
> Tel. 7-915-278-39-36 > Skype: georgemelikov > > Best regards, > Георгий Меликов, > m...@gmelikov.ru > Mobile:         +7 9152783936 > Skype:     georgemelikov -- Ken Gaillot ___ Users mailing list: Users@

[ClusterLabs] Pacemaker 2.0.0-rc4 now available

2018-05-15 Thread Ken Gaillot
t and appreciated. Many thanks to all contributors of source code to this release, including Gao,Yan, Hideo Yamauchi, Jan Pokorný, and Ken Gaillot. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/lis

Re: [ClusterLabs] How to set up fencing/stonith

2018-05-17 Thread Ken Gaillot
that a fence target is never monitoring its own fence device (which would be almost pointless). There was a distant time when such a constraint was a requirement for fencing to work, but now it's just for monitoring. I'm not familiar with VMware fencing, so I can't co

Re: [ClusterLabs] How to set up fencing/stonith

2018-05-18 Thread Ken Gaillot
e="1" login_timeout=60 > power_wait=3 op monitor interval=60s > > This results in the following error: > > Error: Unable to create resource 'stonith:fence_vmware_soap', it is > not installed on this system (use --force to override) > > In the output of `

Re: [ClusterLabs] How to set up fencing/stonith

2018-05-18 Thread Ken Gaillot
ng agent for Fujitsu-Siemens RSB > fence_sanbox2 - Fence agent for QLogic SANBox2 FC switches > fence_sbd - Fence agent for sbd > fence_scsi - Fence agent for SCSI persistentl reservation > fence_tripplite_snmp - Fence agent for APC, Tripplite PDU over SNMP > fence_vbox -

Re: [ClusterLabs] How to set up fencing/stonith

2018-05-18 Thread Ken Gaillot
tgresql-master-vip symmetrical=false kind=Mandatory > pcs cluster cib-push /tmp/dbpg.xml > -- > > Here is the output of `pcs status` before powering off the primary: > > -- > Online: [ d-gp2-dbpg0-1 d-gp2-dbpg0-2 d-gp2-dbpg0-3 ] > > Full list of resources: >

Re: [ClusterLabs] ethmonitor RA agent error. How can I fix this? (RHEL)

2018-05-22 Thread Ken Gaillot
topped > 3. Even after enabling eth0 of node1, error from previous procedure > still exist. > 4. Got an additional error, I have two errors now > 5. VirtualIP resource doesn't start > > > Regards, > > imnotarobot -- Ken Gaillot

Re: [ClusterLabs] DLM fencing

2018-05-24 Thread Ken Gaillot
t; > > ___________ > > > Users mailing list: Users@clusterlabs.org > > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > > > Project Home: http://www.clusterlabs.org > > > Gettin

Re: [ClusterLabs] DLM fencing

2018-05-24 Thread Ken Gaillot
On Thu, 2018-05-24 at 16:14 +0200, Klaus Wenninger wrote: > On 05/24/2018 04:03 PM, Ken Gaillot wrote: > > On Thu, 2018-05-24 at 06:47 -0400, Jason Gauthier wrote: > > > On Thu, May 24, 2018 at 12:19 AM, Andrei Borzenkov > > il.c > > > om> wrote: > >

Re: [ClusterLabs] PAF not starting resource successfully after node reboot (was: How to set up fencing/stonith)

2018-05-27 Thread Ken Gaillot
_event:381, 0) But the stop fails too > May 22 23:57:24 [2196] d-gp2-dbpg0-2 pengine:  warning: > pe_fence_node:   Node d-gp2-dbpg0-1 will be fenced because of > resource failure(s) which is why the cluster then wants to fence the node. (If a resource won't stop, the o

Re: [ClusterLabs] How does failure-timeout works, will the resource not be scheduled when setting too short?

2018-05-27 Thread Ken Gaillot
le > > I tested it several times, and the results were the same. Why does > the resource not be scheduled when failure-timeout setting too short? > And what does  > > it have to do with the time consuming stop of another resource?  Is > this a bug? > > My pacem

Re: [ClusterLabs] Live migrate a VM in a cluster group

2018-05-29 Thread Ken Gaillot
r prefers beta. You're confusing the cluster :) Any constraints starting with "cli-" were added by command-line tools doing move/ban/etc. They stay in effect until they are manually removed (the same tool will generally have a "clear" option). > location cli-prefer-p_Cali

Re: [ClusterLabs] Pacemaker PostgreSQL cluster

2018-05-29 Thread Ken Gaillot
nd best way for you to go -- I'm guessing the regression fixes were already backported into those packages. > 4. Where I can find the list of (ubuntu) dependencies required to > pacemaker/corosync for 1.1.18 and 2.0.0? > > Thanks in advance for your help. > -- Ken Gaillot

Re: [ClusterLabs] Live migrate a VM in a cluster group

2018-05-29 Thread Ken Gaillot
On Tue, 2018-05-29 at 10:14 -0500, Ken Gaillot wrote: > On Sun, 2018-05-27 at 22:50 -0400, Jason Gauthier wrote: > > Greetings, > > > >  I've set up a cluster intended for VMs.  I created a VM, and have > > been pretty pleased with migrating it back and forth bet

Re: [ClusterLabs] Pacemaker PostgreSQL cluster

2018-05-29 Thread Ken Gaillot
; > > > Good luck anyway :) > > > > --  > > Jehan-Guillaume de Rorthais > > Dalibo > > ___ > Users mailing list: Users@clusterlabs.org > https://lists.clusterlabs.org/mailman/listinfo/users > > Projec

Re: [ClusterLabs] PAF not starting resource successfully after node reboot (was: How to set up fencing/stonith)

2018-05-29 Thread Ken Gaillot
On Tue, 2018-05-29 at 15:56 -0600, Casey & Gina wrote: > > On May 27, 2018, at 2:28 PM, Ken Gaillot > > wrote: > > > > > May 22 23:57:24 [2196] d-gp2-dbpg0-2 pengine: info: > > > determine_op_status: Operation monitor found resource postgresql- >

Re: [ClusterLabs] PAF not starting resource successfully after node reboot (was: How to set up fencing/stonith)

2018-05-29 Thread Ken Gaillot
On Tue, 2018-05-29 at 13:09 -0600, Casey & Gina wrote: > > On May 27, 2018, at 2:28 PM, Ken Gaillot > > wrote: > > > > Pacemaker isn't fencing because the start failed, at least not > > directly: > > > > > May 22 23:57:24 [2196] d-gp2-d

Re: [ClusterLabs] Pacemaker PostgreSQL cluster

2018-05-30 Thread Ken Gaillot
ak > er_Explained/ap-upgrade.html > > What I want to do is first migrate pacemaker manually and then > automate it with some scripts. > > According to what Ken Gaillot said: > > "Rolling upgrades are always supported within the same major number > line > (i.e.

Re: [ClusterLabs] [questionnaire] Do you manage your pacemaker configuration by hand and (if so) what reusability features do you use?

2018-05-31 Thread Ken Gaillot
> > [1] https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-si > ngle/Pacemaker_Explained/index.html#_reusing_resource_definitions > [2] https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-si > ngle/Pacemaker_Explained/index.html#s-reusing-con

[ClusterLabs] Pacemaker 2.0.0-rc5 now available

2018-05-31 Thread Ken Gaillot
e devices and Pacemaker Remote connection resources). * Allow a monitor to be cancelled when its resource is unmanaged. The only known issue remaining to be resolved before final release is some tweaking of the transform of pre-2.0 configurations after an upgrade. -- Ken Gaillot ___

Re: [ClusterLabs] Why would a standby node be fenced? (was: How to set up fencing/stonith)

2018-05-31 Thread Ken Gaillot
t; resource state that would not be treated like error > > (causing all sorts of fatal consequences) but still evaluated for > > dependencies (i.e. dependent resources would not be started). That > > would > > be ideal for such case. I'm not clear what such a result wou

Re: [ClusterLabs] Expected resource-discovery=exclusive behavior

2018-06-01 Thread Ken Gaillot
=C virsh list >  Id    Name                           State > >  9     sl-gate-01               running > > > [root@n03 mmike]# LANG=C virsh list >  Id    Name                           State > ---

Re: [ClusterLabs] Resource-stickiness is not working

2018-06-01 Thread Ken Gaillot
s will be added together to determine where the final placement is. In this case, I'd check that you don't have any constraints with a higher score preferring the other node. For example, if you previously did a "move" or "ban" from the
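The point about scores being added together can be illustrated with a minimal Python sketch. It is not Pacemaker's scheduler, only a simplified model: the node names, the stickiness value of 100, and the helper names `add_scores` and `preferred_node` are hypothetical, and the sketch assumes Pacemaker's convention that INFINITY equals 1,000,000 and that -INFINITY wins over +INFINITY.

```python
INFINITY = 1_000_000  # assumed: Pacemaker's numeric value for INFINITY


def add_scores(a, b):
    """Add two scores, saturating at +/-INFINITY (-INFINITY wins)."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


def preferred_node(scores_by_node):
    """Sum every score that applies to each node and pick the highest total.

    scores_by_node maps a node name to a list of scores (stickiness,
    location constraints, ...). This ignores everything else the real
    scheduler considers.
    """
    totals = {}
    for node, scores in scores_by_node.items():
        total = 0
        for score in scores:
            total = add_scores(total, score)
        totals[node] = total
    best = max(totals, key=totals.get)
    return best, totals


# Hypothetical case: the resource runs on node1 with stickiness 100, but a
# leftover "move" constraint prefers node2 with INFINITY.
best, totals = preferred_node({"node1": [100], "node2": [INFINITY]})
print(best, totals)
```

In this model the leftover constraint outweighs any finite stickiness, which is why such a constraint has to be removed before stickiness can keep the resource in place.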

Re: [ClusterLabs] Resource-stickiness is not working

2018-06-04 Thread Ken Gaillot
s case, I'd check that you don't have any constraints with a > higher score preferring the other node. For example, if you > previously  > did a "move" or "ban" from the command line, that adds a constraint > that has to be removed manually if you no long

Re: [ClusterLabs] Knowing where a resource is running

2018-06-04 Thread Ken Gaillot
tus of > the resources and parse the output, but I’d prefer a cleaner and less > fragile solution.  Any suggestions? > Thanks! You're right node B won't get notifications in that case, but you can check the value of OCF_RESKEY_CRM_meta_notify_master_uname
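As a rough sketch of how an agent written in Python might read that variable: the example below assumes the variable is present in the agent's environment during a notify action and holds a whitespace-separated list of node names. The helper name `current_masters` is hypothetical.

```python
import os


def current_masters(environ=None):
    """Return the node names listed in the notify master variable.

    Assumption: the value is a whitespace-separated list of node names.
    Returns an empty list if the variable is unset or empty.
    """
    env = os.environ if environ is None else environ
    raw = env.get("OCF_RESKEY_CRM_meta_notify_master_uname", "")
    return raw.split()


if __name__ == "__main__":
    print(current_masters())
```

Passing the environment as a parameter keeps the helper easy to exercise outside a running cluster.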

Re: [ClusterLabs] Resource-stickiness is not working

2018-06-05 Thread Ken Gaillot
t; > > Regards, > > > imnotarobot > >  > > Your configuration is correct, but keep in mind scores of all kinds > > will be added together to determine where the final placement is. > >  > > In this case, I'd check that you don't have any constra

Re: [ClusterLabs] resource agent Route active on multiple nodes

2018-06-06 Thread Ken Gaillot
> Your resource is *not* active. Attempt to start it failed on both > nodes. You need to investigate why it happened. Most obvious reason > would be missing "trust" table. Do you have fencing configured? The cluster will not normally attempt to recover a resource

Re: [ClusterLabs] Pengine always trying to start the resource on the standby node.

2018-06-07 Thread Ken Gaillot
gt; > clustera    pengine:     info: native_stop_constraints: > > > cluster_fs_stop_0 is implicit after clusterb is fenced > > > clustera    pengine:     info: native_stop_constraints: > > > cluster_vip_stop_0 is implicit after clusterb is fenced > > > clustera

Re: [ClusterLabs] Pengine always trying to start the resource on the standby node.

2018-06-13 Thread Ken Gaillot
;    error: write_xml_stream: Cannot write NULL to > /var/lib/pacemaker/cib/shadow.20008 >    Could not create '/var/lib/pacemaker/cib/shadow.20008': Success > > Could anyone help me how to read those messages and what's going on > my server? > > Thanks

Re: [ClusterLabs] Limit of concurrent ressources to start?

2018-06-13 Thread Ken Gaillot
> Hi, > > additional remark: > > With some tweaks I made my cluster start two resources (i.e. IP1 and > IP2) at the same time. But it takes about 4 seconds to that the > cluster starts the next resources (i.e. IP3 and IP4). > > Did anybody see this behaviour before? &g

Re: [ClusterLabs] Fencing libvirt

2018-06-18 Thread Ken Gaillot
stlist="alpha beta" > \ > op monitor interval=2h \ > meta target-role=Stoppedprimitive st_libvirt > stonith:external/libvirt \ > params hypervisor_uri="qemu:///system" hostlist="alpha beta" > \ > op monitor interval

Re: [ClusterLabs] corosync doesn't start any resource

2018-06-18 Thread Ken Gaillot
he Log-files > > https://paste.debian.net/hidden/9376add7/ > > best regards > Stefan As of the end of that log file, the cluster does intend to start the resources: Jun 15 14:29:11 [5623] zfs-serv3 pengine:   notice: LogActions: Start   nfs-server (zfs-serv3) Ju

Re: [ClusterLabs] Fencing libvirt

2018-06-18 Thread Ken Gaillot
On Mon, 2018-06-18 at 10:10 -0400, Jason Gauthier wrote: > On Mon, Jun 18, 2018 at 9:55 AM Ken Gaillot > wrote: > > > > On Fri, 2018-06-15 at 21:39 -0400, Jason Gauthier wrote: > > > Greetings, > > > > > >    Previously, I was using fiber channel w

Re: [ClusterLabs] Fencing libvirt

2018-06-19 Thread Ken Gaillot
On Mon, 2018-06-18 at 21:01 -0400, Jason Gauthier wrote: > On Mon, Jun 18, 2018 at 11:12 AM Jason Gauthier > wrote: > > > > On Mon, Jun 18, 2018 at 10:58 AM Ken Gaillot > > wrote: > > > > > > On Mon, 2018-06-18 at 10:10 -0400, Jason Gauthier wrote: &

Re: [ClusterLabs] corosync doesn't start any resource

2018-06-19 Thread Ken Gaillot
fencing method, but it has a problem if it's on-board. If the host loses power entirely, IPMI will not respond, the fencing will fail, and the cluster will be unable to recover. On-board IPMI requires a back-up method such as an intelligent power switch or sbd. > > Quorum: >   Op

Re: [ClusterLabs] corosync doesn't start any resource

2018-06-20 Thread Ken Gaillot
zfs-serv4 | action 14 > Jun 20 12:17:07 [27559] zfs-serv3   crmd:   notice: > te_rsc_command: Initiating monitor operation resIPMI- > zfs4_monitor_6 locally on zfs-serv3 | action 12 > Jun 20 12:17:08 [27559] zfs-serv3   crmd: info: > process_lrm_event:   

[ClusterLabs] Pacemaker-1.1.19-rc1 now available

2018-06-20 Thread Ken Gaillot
ncluding Andrew Beekhof, Gao,Yan, Hideo Yamauchi, Jan Pokorný, and Ken Gaillot. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Gettin

Re: [ClusterLabs] VM failure during shutdown

2018-06-25 Thread Ken Gaillot
; name="last-failure-WindowSentinelOne_res" value="1529912497"/> > Jun 25 07:41:37 [5130] sgw-02    cib: info:  > cib_process_request:    Completed cib_modify operation for section  > status: OK (rc=0, origin=sgw-02/attrd/11, version=0.4704.70) > Jun 25

Re: [ClusterLabs] VM failure during shutdown

2018-06-25 Thread Ken Gaillot
On Mon, 2018-06-25 at 09:47 -0500, Ken Gaillot wrote: > On Mon, 2018-06-25 at 11:33 +0300, Vaggelis Papastavros wrote: > > Dear friends , > > > > We have the following configuration : > > > > CentOS7 , pacemaker 0.9.152 and Corosync 2.4.0, storage with DRBD >

Re: [ClusterLabs] Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)

2018-06-26 Thread Ken Gaillot
Ulrich > > > > > > ___ > > Users mailing list: Users@clusterlabs.org > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > Project Home: http://www.clusterlabs.org > > Getting started: http://www.clusterlabs.org/doc/Cluster_fr

Re: [ClusterLabs] difference between external/ipmi and fence_ipmilan

2018-06-26 Thread Ken Gaillot
nts. Thus, you often see an "external/*" agent and a "fence_*" agent available for the same physical device. However, they are completely different implementations, so there may be substantive differences as well. I'm not familiar enough with these two to a

Re: [ClusterLabs] VM failure during shutdown

2018-06-26 Thread Ken Gaillot
> > pcs resource cleanup windows_VM_res > After the above steps the VM is located on the correct node and > everything is ok. > > Is my approach correct ? > > Your opinion would be valuable, > Sincerely  > > > On 06/25/2018 07:15 PM, Ken Gaillot wrote: > &

Re: [ClusterLabs] Stop one VM, another tries to migrate

2018-06-26 Thread Ken Gaillot
1): > Error > > Jun 26 07:02:09 [4557] alpha   crmd: info: > abort_transition_graph:  Transition aborted by operation > Lapras_migrate_to_0 'modify' on beta: Event failed | > magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df cib=1.443.2 &g

[ClusterLabs] Pacemaker 2.0.0-rc6 now available

2018-06-26 Thread Ken Gaillot
release may come in handy:   https://wiki.clusterlabs.org/wiki/Pacemaker_2.0_Changes Many thanks to contributors of source code to this release, including Jan Pokorný, Klaus Wenninger, and Ken Gaillot. -- Ken Gaillot ___ Users mailing list: Users

Re: [ClusterLabs] Antw: Re: Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)

2018-06-27 Thread Ken Gaillot
On Wed, 2018-06-27 at 07:41 +0200, Ulrich Windl wrote: > > > > Ken Gaillot wrote on 26.06.2018 at > > > > 18:22 in message > > <1530030128.5202.5.ca...@redhat.com>: > > On Tue, 2018-06-26 at 10:45 +0300, Vladislav Bogdanov wrote: > > &g

Re: [ClusterLabs] Antw: Re: Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)

2018-06-27 Thread Ken Gaillot
On Wed, 2018-06-27 at 09:18 -0500, Ken Gaillot wrote: > On Wed, 2018-06-27 at 07:41 +0200, Ulrich Windl wrote: > > > > > Ken Gaillot wrote on 26.06.2018 at > > > > > 18:22 in message > > > > <1530030128.5202.5.ca...@redhat.com>: > >

Re: [ClusterLabs] VM failure during shutdown

2018-06-27 Thread Ken Gaillot
ode1 , windows ---> > storage thus from transitive rule windows_VM ---> node1 > > pcs constraint location clone_ProcDRBD_SigmaVMs prefers sgw-01 > > pcs constraint colocation add windows_VM_res with > StorageDRBD_SigmaVMs INFINITY > > pcs constraint order start StorageDRBD_S

Re: [ClusterLabs] Install fresh pacemaker + corosync fails

2018-06-28 Thread Ken Gaillot
looking for specific features, I'd go with whatever stock packages are available for libqb and corosync. knet will be supported by corosync 3 and is bleeding-edge at the moment (though probably solid). If you do want to compile libqb and/or corosync, the

Re: [ClusterLabs] Pacemaker not restarting Resource on same node

2018-06-28 Thread Ken Gaillot
oing anything else. Certain OCF resource agent exit codes are considered "hard" errors that prevent retrying on the same node: missing dependencies, file permission errors, etc. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http
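The distinction can be sketched in Python as follows. This is a simplified illustration, not Pacemaker's recovery logic: it lists only the two cases named above, using the exit code values from the OCF resource agent standard (4 for insufficient permissions, 5 for a missing installation or dependency). The set is intentionally incomplete, and the helper name `may_retry_on_same_node` is hypothetical.

```python
# Exit codes as defined by the OCF resource agent standard.
OCF_ERR_PERM = 4       # insufficient permissions
OCF_ERR_INSTALLED = 5  # required software or dependency is missing

# Intentionally incomplete: only the two cases mentioned in the message.
HARD_ERRORS = {OCF_ERR_PERM, OCF_ERR_INSTALLED}


def may_retry_on_same_node(exit_code):
    """Return False for exit codes treated here as "hard" errors."""
    return exit_code not in HARD_ERRORS


for code in (1, 4, 5):
    print(code, may_retry_on_same_node(code))
```

A generic failure (exit code 1) is retried on the same node in this model, while the two hard errors are not.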

Re: [ClusterLabs] Antw: Re: Antw: Re: Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)

2018-06-28 Thread Ken Gaillot
On Thu, 2018-06-28 at 09:13 +0200, Ulrich Windl wrote: > > > > Ken Gaillot wrote on 27.06.2018 at > > > > 16:32 in message > > <1530109926.6452.3.ca...@redhat.com>: > > On Wed, 2018-06-27 at 09:18 -0500, Ken Gaillot wrote: > > > On

Re: [ClusterLabs] Antw: Re: Antw: Re: Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)

2018-06-28 Thread Ken Gaillot
On Thu, 2018-06-28 at 09:09 +0200, Ulrich Windl wrote: > > > > Ken Gaillot wrote on 27.06.2018 at > > > > 16:18 in message > > <1530109097.6452.1.ca...@redhat.com>: > > On Wed, 2018-06-27 at 07:41 +0200, Ulrich Windl wrote: > > > > >

Re: [ClusterLabs] Problem with pacemaker resources when NTP sync is done

2018-07-04 Thread Ken Gaillot
to handle large time jumps. Jumps forward aren't too bad, but jumps backward can cause significant trouble. > #  pacemakerd --version > Pacemaker 1.1.16 > Written by Andrew Beekhof > # corosync -v > Corosync Cluster Engine, version '2.4.2' > Copyright

Re: [ClusterLabs] Cluster from scratch - 7.6. Configure the Cluster for the DRBD device

2018-07-05 Thread Ken Gaillot
ts. I believe the note about the version shipped with CentOS 7.1 is no longer an issue with recent versions. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http

Re: [ClusterLabs] Pacemaker alert framework

2018-07-06 Thread Ken Gaillot
You could even combine everything into a single custom resource agent for use as a master/slave resource, where the master is the only instance that actually runs the resource, and the slaves just act on the notifications. > > Regards, > Klaus > > > Thanks > > > >

[ClusterLabs] Pacemaker 2.0.0 has been released

2018-07-06 Thread Ken Gaillot
aker Explained" document has grown large enough that topics related to cluster administration have been moved to their own new document, "Pacemaker Administration": http://clusterlabs.org/pacemaker/doc/ Many thanks to all contributors of source code to this release, including Andrew B

Re: [ClusterLabs] Clearing failed actions

2018-07-09 Thread Ken Gaillot
ilures. > > Also, is there a way to clear one specific item from the list, or > > is clearing > > all the only option? > > pcs failcount reset [node] With the low level tools, you can use -r / --resource and/or -N / --node with crm_resource to limit the cle

Re: [ClusterLabs] What triggers fencing?

2018-07-11 Thread Ken Gaillot
gt; > > > > > > different, it will work > > > > > > > fine. The reason not to do this is that if you use 0, > > > > > > > then don't use > > > > > > > anything at all (0 is default), and any other value > &g

Re: [ClusterLabs] Antw: OCF Return codes OCF_NOT_RUNNING

2018-07-11 Thread Ken Gaillot
; > if I have a resource threshold set >1,  i get start->monitor->stop > > cycle > > until the threshold is consumed > > Then either your start is broken, or your monitor is broken. Try to > validate your RA using ocf-tester before using it. > > Regard

[ClusterLabs] Pacemaker 1.1.19 released

2018-07-11 Thread Ken Gaillot
nks to all contributors of source code to this release, including Andrew Beekhof, Gao,Yan, Hideo Yamauchi, Jan Pokorný, Ken Gaillot, and Klaus Wenninger. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailma

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-11 Thread Ken Gaillot
info/users > > > > > > Project Home: http://www.clusterlabs.org > > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scra > > > tch.pdf > > > Bugs: http://bugs.clusterlabs.org > > > > > > > ___ > &g

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-11 Thread Ken Gaillot
h configuration must be done first. So maybe the idea was to always require someone to specify run levels. But it does make more sense that they would be listed in the LSB header. One reason it wouldn't have been an issue before is some older distros use the init script's

[ClusterLabs] FYI: regression using 2.0.0 / 1.1.19 Pacemaker Remote node with older cluster nodes

2018-07-16 Thread Ken Gaillot
rading any Pacemaker Remote nodes (which is the recommended practice anyway). -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting sta

Re: [ClusterLabs] Weird Fencing Behavior

2018-07-17 Thread Ken Gaillot
enabled: true > > > > Quorum: > >   Options: > > > > > > > > ___ > > Users mailing list: Users@clusterlabs.org > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > Project Home: http://www.clu

Re: [ClusterLabs] FYI: regression using 2.0.0 / 1.1.19 Pacemaker Remote node with older cluster nodes

2018-07-17 Thread Ken Gaillot
leave this as a known issue, and rely on the workarounds. On Mon, 2018-07-16 at 09:21 -0500, Ken Gaillot wrote: > Hi all, > > The just-released Pacemaker 2.0.0 and 1.1.19 releases have an issue > when a Pacemaker Remote node is upgraded before the cluster nodes. > > Pacemaker 2.0.0 co

Re: [ClusterLabs] ping Resource Agent doesnt work

2018-07-24 Thread Ken Gaillot
mailing list: Users@clusterlabs.org > https://lists.clusterlabs.org/mailman/listinfo/users > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch. > pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot

Re: [ClusterLabs] ping Resource Agent doesnt work

2018-07-25 Thread Ken Gaillot
mailing list: Users@clusterlabs.org > https://lists.clusterlabs.org/mailman/listinfo/users > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch. > pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot

Re: [ClusterLabs] ask for help for a pacemaker problem

2018-07-26 Thread Ken Gaillot
g/show_bug.cgi?id=5361 However an unanswered question is how the loop got started. One of the nodes thought it received a shutdown request, but the other node didn't think it sent one. That is a mystery here. If you can find the "Shutdown REQ" message, the logs from both nodes around

Re: [ClusterLabs] 2 active node with different service and 1 passive

2018-07-31 Thread Ken Gaillot
500), and R on node 2 with a -INFINITY score. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlab

Re: [ClusterLabs] ban node or disable (all) resources upon node addition to the cluster - how?

2018-08-01 Thread Ken Gaillot
luster to false, then add any number of new nodes; to allow a resource on one of the new nodes, add a location constraint enabling it. Another approach would be to set resource-stickiness to INFINITY, so when you add the node, nothing moves. -- Ken Gaillot _

Re: [ClusterLabs] Fence agent ends up stopped with no clear reason why

2018-08-01 Thread Ken Gaillot
below.  I've found that I can > > do a `pcs resource cleanup vmware_fence` to cause it to start back > > up again in a few seconds, but why is this happening and how can I > > prevent it? > > > > vmware_fence(stonith:fence_vmware_rest):Stopped > > > &g

Re: [ClusterLabs] Fence agent executing thousands of API calls per hour

2018-08-01 Thread Ken Gaillot
> > > sustainable. > > > > > > > > > > Unfortunately the logging available from vmWare doesn't give > > > > > a lot of information - it just says the number of API calls, > > > > > not which API(s) were called. > > > > > > &g

Re: [ClusterLabs] Why Won't Resources Move?

2018-08-01 Thread Ken Gaillot
[ Error signing on to the > CIB service: Transport endpoint is not connected ] The message likely came from the resource agent calling crm_attribute to set a node attribute. That message usually means the cluster isn't running on that node, so it's highly suspect. The cib might have cras

Re: [ClusterLabs] Why Won't Resources Move?

2018-08-02 Thread Ken Gaillot
being asked back in November of 2017, and you made the same > comment back then. > > https://lists.clusterlabs.org/pipermail/users/2017-November/013975.ht > ml > > And the solution turned out to be the same for me as it was for that > guy. On the node where I was getting the erro

Re: [ClusterLabs] monitor IP address

2018-08-02 Thread Ken Gaillot
ssible ? > I would be glad if someone can tell me ;) > Best regards. > --  >      Aurélien Kempiak  > System & Network Engineer  Fixe : 03 59 82 20 05  >  125 Avenue de la République 59110 La Madeleine  > 12 rue Marivaux 75002 Paris    

Re: [ClusterLabs] Pacemaker ordering constraints and resource failures

2018-08-08 Thread Ken Gaillot
labs.org/mailman/listinfo/users > > > > Project Home: http://www.clusterlabs.org > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratc > > h.pdf > > Bugs: http://bugs.clusterlabs.org > > > > ___ &

Re: [ClusterLabs] What am I Doing Wrong with Constraints?

2018-08-08 Thread Ken Gaillot
d1-INFINITY) >   p_vip_clust01 with p_fs_clust01 (score:INFINITY) (id:colocation- > p_vip_clust01-p_fs_clust01-INFINITY) >   p_vip_clust02 with p_fs_clust02 (score:INFINITY) (id:colocation- > p_vip_clust02-p_fs_clust02-INFINITY) >   p_azip_clust02 with p_vip_clust02 (score:INFINITY)

Re: [ClusterLabs] Pacemaker ordering constraints and resource failures

2018-08-08 Thread Ken Gaillot
On Wed, 2018-08-08 at 20:55 +0300, Andrei Borzenkov wrote: > 08.08.2018 16:59, Ken Gaillot wrote: > > On Wed, 2018-08-08 at 07:36 +0300, Andrei Borzenkov wrote: > > > 06.08.2018 20:07, Devin A. Bougie wrote: > > > > What is the best way to make sure pacemaker doesn’t

Re: [ClusterLabs] Pacemaker confirm that node was fenced successfully

2018-08-13 Thread Ken Gaillot
usterlabs.org > https://lists.clusterlabs.org/mailman/listinfo/users > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch. > pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot __

Re: [ClusterLabs] Pacemaker confirm that node was fenced successfully

2018-08-13 Thread Ken Gaillot
Praze > > Banka: Fio banka a.s. > Číslo účtu: 2400330446/2010 > BIC: FIOBCZPPXX > IBAN: CZ82 2010 0024 0033 0446 > > > On 13 Aug 2018, at 17:15, Ken Gaillot wrote: > > > > On Sat, 2018-08-11 at 17:38 +0200, FeldHost™ Admin wrote: > > >

Re: [ClusterLabs] Spurious node loss in corosync cluster

2018-08-20 Thread Ken Gaillot
off >     } > } > service { >     name: pacemaker >     ver: 1 > } > amf { >     mode: disabled > } > > Thanks in advance for the help. > Prasad > > ___ > Users mailing list: Users@clusterlabs.org > https://lists.cl

Re: [ClusterLabs] Q: Forcing a role change of master/slave resource

2018-08-20 Thread Ken Gaillot
" using "-G" to see the current scores or "-v " to change them. The node with the highest score will be promoted. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listin

Re: [ClusterLabs] Q: ordering for a monitoring op only?

2018-08-20 Thread Ken Gaillot
e it (Years before I had > written a monitor for HP-UX' cluster that did not have this problem, > even though the configuration files were read from NFS (It's not > magic: Just periodically copy them to shared memory, and read the > config from shared memory). > >

Re: [ClusterLabs] Q: Stickyness across resource restarts

2018-08-20 Thread Ken Gaillot
years. I haven't tested the behavior with older versions to see if it's different though. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.cluste

Re: [ClusterLabs] Antw: Re: Q: ordering for a monitoring op only?

2018-08-21 Thread Ken Gaillot
On Tue, 2018-08-21 at 07:49 +0200, Ulrich Windl wrote: > > > > Ken Gaillot wrote on 20.08.2018 at > > > > 16:49 in > > message > <1534776566.6465.5.ca...@redhat.com>: > > On Mon, 2018‑08‑20 at 10:51 +0200, Ulrich Windl wrote: > > > Hi

Re: [ClusterLabs] Different Times in the Corosync Log?

2018-08-21 Thread Ken Gaillot
state of covering a > very > short interval sequentially (i.e. no intermittent failure recovered > with > a restart of lrmd, AFAICT).  In case it can have any bearing, how do > you start pacemaker -- systemd, initscript, as a corosync plugin, > something else? -- Ken Gaillot

Re: [ClusterLabs] Antw: Re: Spurious node loss in corosync cluster

2018-08-21 Thread Ken Gaillot
specifically. Then go for 100% device utilization, then look for > network bottlenecks... > > A new corosync release cannot fix those, most likely. > > Regards, > Ulrich > > > > > In any case, for the current scenario, we did not see any > > scheduling > >

Re: [ClusterLabs] Q: automaticlly remove expired location constraints

2018-08-23 Thread Ken Gaillot
> > One problem is that the date value is not a constant, and it had to > be compared against the current date&time. > > Regards, > Ulrich crm_resource --clear -r RSC will clear all cli-* constraints -- Ken Gaillot _

Re: [ClusterLabs] Q: (SLES11 SP4) lrm_rsc_op without last-run?

2018-08-23 Thread Ken Gaillot
; > The node is not completely up-to-date, and it's using pacemaker- > 1.1.12-18.1... > > Regards, > Ulrich > > > ___ > Users mailing list: Users@clusterlabs.org > https://lists.clusterlabs.org/mailman/listinfo/users >

Re: [ClusterLabs] Redundant ring not recovering after node is back

2018-08-24 Thread Ken Gaillot
. > > > > > > > > > > > > > > Here you have some of my configuration settings on node 1 > > > > > > > (I probed > > > > > > > already > > > > > > > to change rrp_mode): > > > > > > > > > > > > > > *- corosync.conf* > > > &

Re: [ClusterLabs] Q: Resource Groups vs Resources for stickiness and colocation?

2018-08-29 Thread Ken Gaillot
.el7.x86_64) > > /Ian This sounds like a bug. Feel free to submit a report at bugs.clusterlabs.org and attach the policy engine input file with the unexpected behavior. FYI a group's stickiness is the sum of the stickiness of each active member, though no score can be bigger t
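The stickiness rule for groups can be worked through with a small Python sketch. It assumes the truncated sentence refers to the usual cap at INFINITY, that Pacemaker's INFINITY equals 1,000,000, and that the stickiness values are non-negative. The helper name `group_stickiness` and the sample values are hypothetical.

```python
INFINITY = 1_000_000  # assumed: Pacemaker's numeric value for INFINITY


def group_stickiness(active_member_stickiness):
    """Sum the stickiness of each active group member, capped at INFINITY.

    Assumes non-negative stickiness values.
    """
    return min(sum(active_member_stickiness), INFINITY)


# Three active members with stickiness 200 each give the group 600.
print(group_stickiness([200, 200, 200]))
```

Under these assumptions, a group with several active members is stickier than any single member, which matters when comparing it against a location or colocation score.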

Re: [ClusterLabs] Q: ordering clones with interleave=false

2018-08-29 Thread Ken Gaillot
ifference, whether the resource cannot run on an online > node, or is unable due to a standby or offline node? > > Regards, > Ulrich Interleave=false only applies to instances that will be started in the current transition, so offline nodes don't prevent dependent resources fro

Re: [ClusterLabs] Antw: Re: Q: ordering clones with interleave=false

2018-08-30 Thread Ken Gaillot
On Thu, 2018-08-30 at 08:28 +0200, Ulrich Windl wrote: > > > > Ken Gaillot wrote on 29.08.2018 at > > > > 20:30 in > > message > <1535567455.5594.5.ca...@redhat.com>: > > On Wed, 2018‑08‑29 at 13:30 +0200, Ulrich Windl wrote: > > > Hi! &

Re: [ClusterLabs] Q: native_color scores for clones

2018-08-30 Thread Ken Gaillot
ding ":2" resource has scores 0, 1, and -INFINITY, and the > ":1" resource has score 1 once and -INFINITY twice. > > When I look at the "clone_color" scores, the prm_DLM:* primitives > look as expected (no -INFINITY). However the cln_DLM clones have > sc

Re: [ClusterLabs] Pacemaker startup retries

2018-08-30 Thread Ken Gaillot
no-quorum-policy=ignore \ > default-resource-stickiness=200 \ > stonith-timeout=180s \ > last-lrm-refresh=1534489943 > > > Thanks > > César Hernández Bañó -- Ken Gaillot ___ Users mailing list:

Re: [ClusterLabs] Pacemaker startup retries

2018-08-31 Thread Ken Gaillot
either be crmd (i.e. the cluster itself) or some external program. If it's the cluster, I'd look at the "pengine:" logs on the DC before that, to see if there are any hints (node unclean, etc.). Then keep going backward until the ultimate cause is found. --

Re: [ClusterLabs] Antw: Re: Antw: Q: native_color scores for clones

2018-09-05 Thread Ken Gaillot
On Wed, 2018-09-05 at 09:32 +0200, Ulrich Windl wrote: > > > > Ken Gaillot wrote on 04.09.2018 at > > > > 19:21 in message > > <1536081690.4387.6.ca...@redhat.com>: > > On Tue, 2018-09-04 at 11:22 +0200, Ulrich Windl wrote: > > > >

Re: [ClusterLabs] Pacemaker startup retries

2018-09-05 Thread Ken Gaillot
ection > //node_state[@uname='node2']/transient_attributes: OK (rc=0, > origin=local/crmd/73, version=0.60.30) > Filesystem(p_fs_datosweb)[962]: 2018/08/31_11:00:05 INFO: Running > start for /dev/drbd/by-res/datoswebstorage on /mnt/datosweb > Filesystem(p_fs_database)[961]: 2018/08/31_11:

Re: [ClusterLabs] Pacemaker startup retries

2018-09-05 Thread Ken Gaillot
t; > >  Oh :( I'm using Pacemaker-1.1.14. > Do you know if this reboot retries are just run 3 times? All the > tests I've done the rebooting is finished after 3 times. > > Thanks > Cesar No, if I remember correctly, it would just keep go

Re: [ClusterLabs] Pacemaker startup retries

2018-09-05 Thread Ken Gaillot
On Wed, 2018-09-05 at 09:51 -0500, Ken Gaillot wrote: > On Wed, 2018-09-05 at 16:38 +0200, Cesar Hernandez wrote: > > Hi > > > > > > > > Ah, this rings a bell. Despite having fenced the node, the > > > cluster > > > still considers the node

Re: [ClusterLabs] Pacemaker startup retries

2018-09-05 Thread Ken Gaillot
gt; Cesar If you build from source, you can apply the patch that fixes the issue to the 1.1.14 code base: https://github.com/ClusterLabs/pacemaker/commit/98457d1635db1222f93599b6021e662e766ce62d -- Ken Gaillot ___ Users mailing list: Users@clu
