Re: [ClusterLabs] Fence node when network interface goes down

2021-11-12 Thread Ken Gaillot
that makes sense. Any help is appreciated! > > Thanks. Failure handling is configurable via the on-fail meta-attribute. You can set on-fail=fence for the ethmonitor resource's monitor action to fence the node if the monitor fails. There's also on-fail=standby, but that wi
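As a rough illustration of the on-fail setting described in the reply above, the following Python sketch builds the CIB XML for a monitor operation with on-fail=fence. It is only a sketch under stated assumptions: the helper name, the resource ID "eth0-monitor" and the 10s interval are hypothetical placeholders and are not taken from the thread.

```python
import xml.etree.ElementTree as ET


def monitor_op_xml(resource_id, interval="10s", on_fail="fence"):
    """Return CIB-style XML for a monitor op with the given on-fail policy.

    resource_id and interval are hypothetical placeholders for illustration.
    """
    op = ET.Element("op", {
        "id": f"{resource_id}-monitor-{interval}",
        "name": "monitor",
        "interval": interval,
        "on-fail": on_fail,  # "fence" asks the cluster to fence the node on failure
    })
    return ET.tostring(op, encoding="unicode")


print(monitor_op_xml("eth0-monitor"))
```

The resulting element would sit inside the resource's operations section. Higher-level tools such as pcs or crm shell can set the same attribute without hand-written XML.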

[ClusterLabs] Pacemaker 2.1.2-rc2 now available

2021-11-16 Thread Ken Gaillot
o all contributors of source code to this release, including Chris Lumens, Ferenc Wágner, and Ken Gaillot. -- Ken Gaillot ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] resource start after network reconnected

2021-11-19 Thread Ken Gaillot
et that to 1 minute would that cause any gross > negative > issues? It increases CPU usage and IPC traffic. For Pacemaker 2.0.3 or later, I definitely wouldn't bother. For older versions, 1 minute feels a bit much, I would go with around 5. > > Is there another setting besides c

Re: [ClusterLabs] resource start after network reconnected

2021-11-19 Thread Ken Gaillot
er-recheck-interval would explain the intermittence I > > > saw > > > this > > > morning. If I set that to 1 minute would that cause any gross > > > negative > > > issues? > > > > It increases CPU usage and IPC traffic. For Pacema

Re: [ClusterLabs] Which verson of pacemaker/corosync provides crm_feature_set 3.0.10?

2021-11-23 Thread Ken Gaillot
version can be seen at: https://wiki.clusterlabs.org/wiki/ReleaseCalendar 1.1.13 through 1.1.15 had feature set 3.0.10 > 3. Where could I get source rpms to rebuild this rpm on CentOs 8? > Thanks a lot! > _Vitaly Zolotusky The stock packages in the repos should b

Re: [ClusterLabs] Which verson of pacemaker/corosync provides crm_feature_set 3.0.10?

2021-11-23 Thread Ken Gaillot
upgrade past 1.1.15 would put you in the same situation -- if the 1.1.15 node leaves the cluster, it can't rejoin until it's upgraded to the newer version. > Thank you very much for your help! > _Vitaly > > > On November 23, 2021 5:12 PM Ken Gaillot > > wrote:

[ClusterLabs] Pacemaker 2.1.2 final release now available

2021-11-24 Thread Ken Gaillot
due to internal cluster issues as opposed to agent issues. As usual, it also includes a number of bug fixes. Many thanks to all contributors of source code to this release, including Chris Lumens, Ferenc Wágner, Gao,Yan, Grace Chin, Hideo Yamauchi, Ken Gaillot, Klaus Wenninger, and Oyvind

[ClusterLabs] Feedback wanted: Native language support for Pacemaker help output

2021-12-03 Thread Ken Gaillot
friendly web interface for translations, but with this initial proof-of-concept, it involves GitHub pull requests and reviews. Thoughts? -- Ken Gaillot

Re: [ClusterLabs] pcs update resource command not working

2021-12-09 Thread Ken Gaillot
[root@node01 testadmin]# rpm -qa | grep -Ei 'pcs|pacemaker|corosync' > pacemaker-2.0.2-2.el7.x86_64 > corosync-2.4.4-2.el7.x86_64 > pcs-0.9.169-1.el7.x86_64 > [root@node01 testadmin]# > > Thanks and Regards, > S Sathish S -- Ken Gaillot

Re: [ClusterLabs] VirtualDomain - started but... not really

2021-12-10 Thread Ken Gaillot
erval-0s) >migrate_to interval=0s timeout=180s > (c8kubermaster2-migrate_to-interval-0s) >monitor interval=30s > (c8kubermaster2-monitor-interval-30s) >start interval=0s timeout=90s > (c8kubermaster2-start-interval-0

[ClusterLabs] FYI: fence history display regression in Pacemaker 2.1.2

2021-12-15 Thread Ken Gaillot
m the master branch as a patch. Since this only affects the display, there are no plans for a special release. The fix will land in the next normal release, expected around the middle of 2022. -- Ken Gaillot

Re: [ClusterLabs] VirtualDomain - started but... not really

2021-12-16 Thread Ken Gaillot
On Sat, 2021-12-11 at 13:49 +, lejeczek via Users wrote: > > On 10/12/2021 21:17, Ken Gaillot wrote: > > On Fri, 2021-12-10 at 16:33 +, lejeczek via Users wrote: > > > Hi guys. > > > > > > I quite often.. well, too frequently in my mind, see a VM

Re: [ClusterLabs] VirtualDomain - unable to migrate

2022-01-04 Thread Ken Gaillot
out=90s > (c8kubermaster1-migrate_to-interval-0s) >monitor interval=30s > (c8kubermaster1-monitor-interval-30s) >start interval=0s timeout=60s > (c8kubermaster1-start-interval-0s) > stop interval=0s timeout=60s > (c8kubermaster1-st

Re: [ClusterLabs] [IMPORTANT] CI update

2022-01-04 Thread Ken Gaillot
> CI is back online 100%. > > > > Fabio

Re: [ClusterLabs] Feedback wanted: Native language support for Pacemaker help output

2022-01-10 Thread Ken Gaillot
Re-raising this due to the recent holidays ... Is translation of Pacemaker option help and man pages something people would like to see? Would anyone be willing to contribute or proofread translations if the tools were easy? On Fri, 2021-12-03 at 15:02 -0600, Ken Gaillot wrote: > Hi all, >

Re: [ClusterLabs] Antw: [EXT] Re: Feedback wanted: Native language support for Pacemaker help output

2022-01-13 Thread Ken Gaillot
stack translated, when a more complex setup is > needed -> you will always have to search in the source/github > issues/documentation/mailing list history and rely on English. > > Best Regards, > Strahil Nikolov > > > On Tue, Jan 11, 2022 at 9:23, Ulrich Windl

Re: [ClusterLabs] Reduce failover time by concurent stop of 2 RG

2022-01-27 Thread Ken Gaillot
> both Resource Groups to improve the failover time? > > BR, > J. Gogu The only thing I can think of is on-fail=fence -- Ken Gaillot

[ClusterLabs] Native Chinese speaker wanted to proof a few translations

2022-01-27 Thread Ken Gaillot
riginal English, and the "msgstr" entries are the translations. You can either review them in GitHub or reply here. Thanks to anyone who can help! -- Ken Gaillot

Re: [ClusterLabs] Pacemaker managing Keycloak

2022-01-28 Thread Ken Gaillot
l database? > > Thanks in advance. I'd check for SELinux denials first. A command executed from the command line is unconstrained, while being executed by a daemon is subject to SELinux policies. Other than that, maybe turn on any debugging options and check the keycloak logs from the conta

Re: [ClusterLabs] Removing a resource without stopping it

2022-01-28 Thread Ken Gaillot
nfiguration, which includes the is-managed setting, Pacemaker no longer knows the resource is unmanaged. And even if you set it via resource defaults or something, eventually you have to set it back, at which point Pacemaker will still have the same response. -- Ken Gaillot

Re: [ClusterLabs] Removing a resource without stopping it

2022-01-31 Thread Ken Gaillot
tion, but hopefully > it > at least clarified a bit what's going on. > > > Regards, > Tomas > > > Dne 29. 01. 22 v 6:12 Digimer napsal(a): > > On 2022-01-29 00:10, Digimer wrote: > > > On 2022-01-28 16:54, Ken Gaillot wrote: >

Re: [ClusterLabs] Is there a python package for pacemaker ?

2022-02-02 Thread Ken Gaillot
t to use the subprocess module to execute the Pacemaker command-line tools to do what you want. -- Ken Gaillot
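The subprocess approach suggested above might look roughly like the sketch below. The wrapper name run_tool is hypothetical, and the crm_mon invocation is an assumption about a reasonably recent Pacemaker 2.x installation; check `crm_mon --help` on the target system before relying on those options. A stand-in command is used so the sketch runs on machines without Pacemaker.

```python
import subprocess
import sys


def run_tool(argv, timeout=30):
    """Run a command-line tool and return (exit_code, stdout, stderr)."""
    result = subprocess.run(
        argv,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # let the caller decide how to treat non-zero exits
    )
    return result.returncode, result.stdout, result.stderr


# Assumed invocation on a cluster node (requires Pacemaker to be installed):
CRM_MON_XML = ["crm_mon", "--one-shot", "--output-as=xml"]

if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; swap in CRM_MON_XML
    # on a real cluster node.
    code, out, err = run_tool([sys.executable, "-c", "print('ok')"])
    print(code, out.strip())
```

Passing the command as a list rather than a shell string avoids quoting problems with resource or node names.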

Re: [ClusterLabs] Antw: [EXT] Re: what is the "best" way to completely shutdown a two-node cluster ?

2022-02-11 Thread Ken Gaillot
resting enhancement: > Like the utilization counting "static" resource consumption, one > could have a > dynamic resource consumption (counting semaphore-like) that is > consumed while > an operation on an instance naming that resource is being performed. > So when you name your resource "concurrent_vm_ops" and assign that to > every vm > configuration, eventually initializing the resource to something > like 2 or 3, > then you could limit the concurrent VM invocations. Likewise, for > less heavy > instances you could use more relaxed settings or no restrictions at > all... > > Regards, > Ulrich > You can accomplish something similar with an ordering constraint with kind=Serialize. In the case of "start vm1 then start vm2" with kind=Serialize, it means that vm1 and vm2 will not be started simultaneously, but neither actually requires the other or has to be done in a specific order. -- Ken Gaillot
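To make the kind=Serialize suggestion above concrete, here is a small Python sketch that generates the corresponding ordering constraint as CIB XML. The function name and the constraint ID format are hypothetical; vm1 and vm2 are the example resource names from the reply.

```python
import xml.etree.ElementTree as ET


def serialize_order_xml(first, then):
    """Return an rsc_order element that serializes actions on two resources."""
    constraint = ET.Element("rsc_order", {
        "id": f"order-{first}-{then}-serialize",  # hypothetical ID scheme
        "first": first,
        "then": then,
        "kind": "Serialize",  # never act on both at once; no dependency implied
    })
    return ET.tostring(constraint, encoding="unicode")


print(serialize_order_xml("vm1", "vm2"))
```

Serializing more than two VMs this way would presumably need one such constraint per pair (or a resource set), which is worth weighing before applying it to a large number of guests.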

Re: [ClusterLabs] crm resource stop VirtualDomain - but VirtualDomain shutdown start some minutes later

2022-02-15 Thread Ken Gaillot
ul shutdown request for domain vm_mouseidgenes. > > Any idea ? > What is about that transition 128, which is aborted ? A transition is the set of actions that need to be taken in response to current conditions. A transition is aborted any time conditions change (here, the target-role being changed in the configuration), so that a new set of actions can be calculated. Someone once defined a transition as an "action plan", and I'm tempted to use that instead. Plus maybe replace "aborted" with "interrupted", so then we'd have "Action plan interrupted" which is maybe a little more understandable. > > Transition 128 is finished: > Feb 15 21:04:26 [15370] ha-idg-2 crmd: notice: > run_graph: Transition 128 (Complete=1, Pending=0, Fired=0, > Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input- > 3548.bz2): Complete > > And one second later the shutdown starts. Is that normal that there > is such a big time gap ? > > Bernd No, there should be another transition calculated (with a "saving input" message) immediately after the original transition is aborted. What's the timestamp on that? -- Ken Gaillot

Re: [ClusterLabs] Xen, SLES15, libvirt live-migration and a fencing loop

2022-02-16 Thread Ken Gaillot
"Xen hypervisor" boot entry, > and thus booted the non-Xen kernel. > But still, when booting correctly, the cluster would still try to > "recover" from the false "is active on 2 nodes", so the true fix was > a manual

Re: [ClusterLabs] crm resource stop VirtualDomain - but VirtualDomain shutdown start some minutes later

2022-02-16 Thread Ken Gaillot
oss multiple transitions. Often when some event is happening, lots of micro-conditions (action results, node attribute changes, etc.) change in a short time, and you'll see a new transition after each one. -- Ken Gaillot

Re: [ClusterLabs] crm resource stop VirtualDomain - but VirtualDomain shutdown start some minutes later

2022-02-16 Thread Ken Gaillot
> > Why is there sometimes "complete=true" and sometimes "complete=false" > ? > What does that mean ? > > Bernd "Complete" is whether all actions originally planned in the transition were completed. For complete=true, the log is basically just a heads-up that the cluster needs to recheck things, since there's nothing to actually abort. -- Ken Gaillot

Re: [ClusterLabs] crm resource stop VirtualDomain - but VirtualDomain shutdown start some minutes later

2022-02-17 Thread Ken Gaillot
any new actions from the transition -- but any actions currently in flight must complete before the new transition can be calculated. Changes that abort a transition include configuration changes, a node joining or leaving, an unexpected action result being received, a node attribute changing, the clus

Re: [ClusterLabs] crm resource stop VirtualDomain - but VirtualDomain shutdown start some minutes later

2022-02-18 Thread Ken Gaillot
h before starting the stop of the other > resources. > Cluster tried to "abort" the shutdown, but shutdown can't be aborted. > And I had bad luck that the shutdown of this domain took so long. > > Correct ? > > Bernd > Yes, except that the cluster isn't trying to abort the shutdown; it's just discarding any actions that were planned after it in the same transition. -- Ken Gaillot

Re: [ClusterLabs] Noticed oddity when DC is going to be fenced

2022-03-01 Thread Ken Gaillot
emaker-controld[6980]: notice: State > transition S_TRANSITION_ENGINE -> S_IDLE > > (pacemaker-2.0.5+20201202.ba59be712-150300.4.16.1.x86_64) > > Did I misunderstand something, or does it look like a bug? > > Regards, > Ulrich -- Ken Gaillot

Re: [ClusterLabs] Antw: [EXT] Re: Noticed oddity when DC is going to be fenced

2022-03-02 Thread Ken Gaillot
On Wed, 2022-03-02 at 08:41 +0100, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 01.03.2022 um > > > > 16:04 in > Nachricht > <463458e414f7c411eb1107335be6ee9a6e2d13ee.ca...@redhat.com>: > > On Tue, 2022‑03‑01 at 10:05 +0100, Ulrich Windl wrote: >

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Noticed oddity when DC is going to be fenced

2022-03-04 Thread Ken Gaillot
On Fri, 2022-03-04 at 08:17 +0100, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 02.03.2022 um > > > > 16:10 in > Nachricht > : > > On Wed, 2022-03-02 at 08:41 +0100, Ulrich Windl wrote: > > > > > > Ken Gaillot schrieb

Re: [ClusterLabs] Pacemaker API (REST, SOAP, Java library)?

2022-03-08 Thread Ken Gaillot
and crm_mon will be added with the next release. In the meantime, most people just execute the command-line tools directly from their code. -- Ken Gaillot

Re: [ClusterLabs] PAF with postgresql 13?

2022-03-08 Thread Ken Gaillot
; the > Pacemaker point of view, but I'm not sure how they can track the > dependency > (ping Ken?). Higher-level tools like pcs or crm shell could probably do it when removing the resource (i.e. if the resource was a promotable clone, check for and remove any node attributes o

[ClusterLabs] Coming in Pacemaker 2.1.3: CIB colorization for ACLs

2022-03-09 Thread Ken Gaillot
show the parts of the CIB that the specified user can't see. This feature was initially developed by Jan Pokorný and completed by Grace Chin. -- Ken Gaillot

Re: [ClusterLabs] Parsing the output of crm_mon

2022-03-18 Thread Ken Gaillot
-- Ken Gaillot

Re: [ClusterLabs] Antw: [EXT] Re: Parsing the output of crm_mon

2022-03-21 Thread Ken Gaillot
On Mon, 2022-03-21 at 08:27 +0100, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 18.03.2022 um > > > > 13:39 in > Nachricht > : > > On Fri, 2022‑03‑18 at 08:46 +0100, Ulrich Windl wrote: > > > Hi! > > > > > > Parsing the ou

Re: [ClusterLabs] Resources too_active (active on all nodes of the cluster, instead of only 1 node)

2022-03-24 Thread Ken Gaillot
have very serious impact if such a case can re-occur in spite > of stonith already configured. Hence the ask. > In case this situation gets reproduced, how can it be handled? > > Note: We have stonith configured and it has been working fine so far. > In this case also, the initial fencing happened from stonith only. > > Thanks in advance! -- Ken Gaillot

[ClusterLabs] Goodbye crm_report?

2022-03-24 Thread Ken Gaillot
nt to keeping crm_report around? :-) It would remain available for a long transition period to give time for the updated sosreport plugins to make their way into distros and for higher-level tools and user scripts to be updated. -- Ken Gaillot

Re: [ClusterLabs] Order constraint with a timeout?

2022-03-28 Thread Ken Gaillot
to > start. There > is no timeout option. > > Best regards, > -John > How do you envision the timeout working? You can add a timeout for the ordering itself using rules, where the ordering no longer applies after a certain date/time, but it doesn't sound like that's

Re: [ClusterLabs] Order constraint with a timeout?

2022-03-28 Thread Ken Gaillot
diately start the second resource if the first failed to > > > start. There > > > is no timeout option. > > > > > > Best regards, > > > -John > > > > > > > How do you envision the timeout working? > > > > You can add a t

Re: [ClusterLabs] Q: using rsc_defaults (crm shell syntax)

2022-03-30 Thread Ken Gaillot
labs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#resource-expressions -- Ken Gaillot

Re: [ClusterLabs] Unable to communicate with z2-server-nat2 and Unable to synchronize and save tokens on nodes

2022-04-05 Thread Ken Gaillot
; ... > x.x.x.3 z2-server-nat2 > x.x.x.2 z2-server-nat1 > ... > ... > > --- > - > > I've also made sure the service is up: > > [user1@z2-server-nat2 ~]$ systemctl status pcsd.service > ● pcsd.service - PCS GUI and remote configuration interface >Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; > vendor preset: disabled) >Active: active (running) since Tue 2022-04-05 04:29:16 GMT; 3h > 24min ago > Docs: man:pcsd(8) >man:pcs(8) > Main PID: 856 (pcsd) >Memory: 28.6M >CGroup: /system.slice/pcsd.service >└─856 /usr/bin/ruby /usr/lib/pcsd/pcsd > > Apr 05 04:29:16 z2-server-nat2 systemd[1]: Starting PCS GUI and > remote configuration interface... > Apr 05 04:29:16 z2-server-nat2 systemd[1]: Started PCS GUI and remote > configuration interface. > > --- > - > > Am I missing something in making the nodes able to communicate with > each other? How do I proceed from here? > > Regards, > Chariot -- Ken Gaillot

Re: [ClusterLabs] Pacemaker / ubuntu doesn't see my sbd device: what am I missing?

2022-04-07 Thread Ken Gaillot
uration. > Apr 6 14:40:46 ubuntuserver pacemaker-fenced[349712]: notice: > Operation 'monitor' [349931] for device 'fence-sbd' returned: -61 (No > data available) > Apr 6 14:40:46 ubuntuserver pacemaker-fenced[349712]: warning: fence- > sbd:349931 [ Performing:

Re: [ClusterLabs] SAP HANA monitor fails - Error performing operation: No such device or address

2022-04-08 Thread Ken Gaillot
> meta clone-node-max=1 target-role=Started interleave=true > colocation col_saphana_ip_HPN_HDB00 4000: g_ip_HPN_HDB00:Started > msl_SAPHana_HPN_HDB00:Master > order ord_SAPHana_HPN_HDB00 Optional: cln_SAPHanaTopology_HPN_HDB00 > msl_SAPHana_HPN_HDB00 > property cib-bootstrap-options: \ > la

[ClusterLabs] Coming in Pacemaker 2.1.3: multiple-active=stop_unexpected

2022-04-08 Thread Ken Gaillot
e, those other resources will still need to be fully restarted. This is because any ordering constraint "start A then start B" implies "stop B then stop A", so we can't stop the wrongly active instances of A until B is stopped. -- Ken Gaillot
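The implied reverse ordering mentioned above can be pictured with a toy Python model. This is not Pacemaker's scheduler, only a sketch of the rule that, for a symmetrical ordering, the stop sequence is the start sequence reversed; the function name is hypothetical.

```python
def implied_stop_sequence(start_sequence):
    """Return the stop order implied by a symmetrical start ordering.

    "start A then start B" implies "stop B then stop A", so the stop
    sequence is simply the start sequence reversed.
    """
    return list(reversed(start_sequence))


starts = ["A", "B"]
stops = implied_stop_sequence(starts)
print("start order:", " then ".join(starts))
print("stop order: ", " then ".join(stops))
```

Seen this way, a wrongly active instance of A cannot be stopped until everything ordered after A has been stopped first.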

Re: [ClusterLabs] Antw: [EXT] Re: Coming in Pacemaker 2.1.3: multiple‑active=stop_unexpected

2022-04-11 Thread Ken Gaillot
On Mon, 2022-04-11 at 08:20 +0200, Ulrich Windl wrote: > > > > Andrei Borzenkov schrieb am 09.04.2022 um > > > > 06:48 in > Nachricht <30178b34-d2fd-1af4-58ed-d9d2aa6e6...@gmail.com>: > > On 08.04.2022 20:16, Ken Gaillot wrote: > > > Hi all,

[ClusterLabs] Coming in 2.1.3: node health monitoring improvements

2022-04-12 Thread Ken Gaillot
unning any resources, but not know why, unless you thought to check every node health attribute. -- Ken Gaillot

Re: [ClusterLabs] Antw: [EXT] Coming in 2.1.3: node health monitoring improvements

2022-04-13 Thread Ken Gaillot
On Wed, 2022-04-13 at 08:22 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 12.04.2022 um > > > > 17:22 in > Nachricht > <33f4147d0f6a3e46581aaa46a4eca81dfa59ce15.ca...@redhat.com>: > > Hi all, > > > > I'm hoping to hav

Re: [ClusterLabs] Can a two node cluster start with only one node?

2022-04-20 Thread Ken Gaillot
ly it will require manual intervention again to get going. -- Ken Gaillot

Re: [ClusterLabs] Can a two node cluster start resources if only one node is booted?

2022-04-20 Thread Ken Gaillot
stVote: No QdeviceMasterWins: > No > > Is there something specific I should look for in the log? > > So can a two node cluster work after booting only one node? Maybe it > never will and I am wasting a lot of time, yours and mine. > > If it can, what else can I investigate further? > > Best regards, > John > What does crm_mon show when the node is up by itself? -- Ken Gaillot

[ClusterLabs] Pacemaker 2.1.3-rc1 now available

2022-04-21 Thread Ken Gaillot
ng Chris Lumens, Chrissie Caulfield, Gao,Yan, Grace Chin, Hideo Yamauchi, Jan Friesse, Jan Pokorný, Ken Gaillot, Klaus Wenninger, Liang,Xin, Reid Wahl, Tomas Jelinek, and Wangluwei. -- Ken Gaillot

Re: [ClusterLabs] OCF_TIMEOUT - Does it recover by itself?

2022-04-26 Thread Ken Gaillot
> * fence-server02 (stonith:fence_vmware_rest): Started > server01 > ... > > Is "pcs resource cleanup" the right way to remove those messages ? > > > > > Atenciosamente/Kind regards, > Salatiel -- Ken Gaillot

Re: [ClusterLabs] Antw: [EXT] Re: OCF_TIMEOUT ‑ Does it recover by itself?

2022-04-27 Thread Ken Gaillot
On Wed, 2022-04-27 at 08:49 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 26.04.2022 um > > > > 21:24 in > Nachricht > : > > On Tue, 2022‑04‑26 at 15:20 ‑0300, Salatiel Filho wrote: > > > I have a question about OCF_TIMEOUT. Some time

Re: [ClusterLabs] How many nodes redhat cluster does supports

2022-04-27 Thread Ken Gaillot
s their own limits based on practicality -- often 16 or 32 full cluster nodes (more are possible with Pacemaker Remote). -- Ken Gaillot

Re: [ClusterLabs] Help understanding recover of promotable resource after a "pcs cluster stop --all"

2022-05-02 Thread Ken Gaillot
lt behavior in that situation. There must be something else in the configuration that is preventing promotion. The DRBD resource agent should set a promotion score for the node. You can run "crm_mon -1A" to show all node attributes; there should be one like "master-DRBDData" for t

Re: [ClusterLabs] Help understanding recover of promotable resource after a "pcs cluster stop --all"

2022-05-02 Thread Ken Gaillot
o set it. I'm not familiar enough with that agent to know why it might not. > > > > Atenciosamente/Kind regards, > Salatiel > > On Mon, May 2, 2022 at 12:26 PM Ken Gaillot > wrote: > > On Mon, 2022-05-02 at 09:58 -0300, Salatiel Filho wrote: > > > Hi, I am trying to

[ClusterLabs] Pacemaker 2.1.3-rc2 now available

2022-05-18 Thread Ken Gaillot
, Ken Gaillot, and Reid Wahl. -- Ken Gaillot

Re: [ClusterLabs] Cluster unable to find back together

2022-05-19 Thread Ken Gaillot
> > > > [https://go.aciworldwide.com/rs/030-ROK-804/images/aci-footer.jpg > > ] <http://www.aciworldwide.com> > > This email message and any attachments may contain confidential, > > proprietary or non-public information. The information is intended > > solely for

Re: [ClusterLabs] What/how to clean up when bootstrapping new cluster (or: I have a phantom node)

2022-05-24 Thread Ken Gaillot
tive resources > > > What is the cleanup step (or steps) that I'm missing? Or are there so > many details that it's best to leave this to pcs/crmsh? crm_node --remove node1 or just don't start pacemaker until corosync is correct. pcs/crmsh are definitely much easier to use (especially as the number of nodes grows) but if you're looking to learn low-level details, there's nothing wrong with that. -- Ken Gaillot

Re: [ClusterLabs] No node name in corosync-cmapctl output

2022-05-31 Thread Ken Gaillot
32) = 2 > nodelist.node.1.ring0_addr (str) = k2 > nodelist.node.2.nodeid (u32) = 3 > nodelist.node.2.ring0_addr (str) = k3 > > Why not also use "uname -n" when "name" is not explicitly set in the > corosync nodelist config? -- Ken Gaillot

[ClusterLabs] Pacemaker 2.1.3 final release now available

2022-06-01 Thread Ken Gaillot
colorized for a user's ACLs. Many thanks to all contributors of source code to this release, including Chris Lumens, Chrissie Caulfield, Gao,Yan, Grace Chin, Hideo Yamauchi, Jan Friesse, Jan Pokorný, Ken Gaillot, Klaus Wenninger, Liang,Xin, Reid Wahl, Tomas Jelinek, and Wangluwei. -- Ken Ga

[ClusterLabs] Pacemaker 2.1.3 release has regression, 2.1.4 coming soon

2022-06-03 Thread Ken Gaillot
is why it wasn't caught before release. A 2.1.4 release with the fix should be available next week. In the meantime, 2.1.3 is perfectly fine for clusters that don't use target-attribute. -- Ken Gaillot

[ClusterLabs] Pacemaker 2.1.4-rc1 now available

2022-06-03 Thread Ken Gaillot
ck is important and appreciated. Many thanks to all contributors of source code to this release, including Chris Lumens, Ken Gaillot, Petr Pavlu, and Reid Wahl. -- Ken Gaillot

Re: [ClusterLabs] Required guidance w.r.t pacemaker

2022-06-08 Thread Ken Gaillot
et me know whether the above scenario can be handled, any > links, examples would be of great help. > > Attaching a picture that depicts the scenario. > > Please do the needful, Thank you > > Regards > Sridhar -- Ken Gaillot

Re: [ClusterLabs] Required guidance w.r.t pacemaker

2022-06-08 Thread Ken Gaillot
ndles-containerized-resources > > Regards > Sridhar > > > On Wed, 8 Jun 2022 at 19:46, Andrei Borzenkov > wrote: > > On 08.06.2022 17:01, Ken Gaillot wrote: > > > On Wed, 2022-06-08 at 18:31 +0530, Sridhar K wrote: > > >> Hi Team, > > >&g

Re: [ClusterLabs] crm status shows CURRENT DC as None

2022-06-14 Thread Ken Gaillot
re any impact on cluster functionality? > Thanks > Priyanka > It is fine for the DC to be NONE briefly, but if it lasts more than a few seconds, something's wrong. The logs should have more details. The cluster is unable to manage resources or fence nodes when there is no DC. Effectiv

Re: [ClusterLabs] Why not retry a monitor (pacemaker-execd) that got a segmentation fault?

2022-06-14 Thread Ken Gaillot
'error' > ... > Jun 14 14:09:16 h19 pacemaker-schedulerd[7442]: notice: * > Recover prm_xen_v04 ( h19 ) > > Regards, > ulrich

Re: [ClusterLabs] Antw: [EXT] Re: Why not retry a monitor (pacemaker‑execd) that got a segmentation fault?

2022-06-14 Thread Ken Gaillot
On Tue, 2022-06-14 at 15:53 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 14.06.2022 um > > > > 15:49 in > Nachricht > : > > On Tue, 2022‑06‑14 at 14:36 +0200, Ulrich Windl wrote: > > > Hi! > > > > > > I had a cas

[ClusterLabs] Pacemaker 2.1.4 final release now available

2022-06-15 Thread Ken Gaillot
Lumens, Ken Gaillot, Petr Pavlu, and Reid Wahl. -- Ken Gaillot

Re: [ClusterLabs] related to fencing in general , docker containers

2022-06-17 Thread Ken Gaillot
uster nodes, and just want to run resources inside containers, then bundles are your best bet: https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#bundles-containerized-resources -- Ken Gaillot

Re: [ClusterLabs] pacemaker-fenced[11637]: warning: Can't create a sane reply

2022-06-22 Thread Ken Gaillot
-- Ken Gaillot

Re: [ClusterLabs] modified RA can't be used

2022-06-27 Thread Ken Gaillot
data section to be the > > same as the filename. > > > > > > Oyvind > > > > OMG. Thank you !!! > > Bernd -- Ken Gaillot

[ClusterLabs] FYI: one more regression introduced in Pacemaker 2.1.3

2022-06-27 Thread Ken Gaillot
urces are advised to wait until the fix is released (expected in 2.1.5 at the end of this year) or ensure that their OS packages include the fix if using 2.1.3 or 2.1.4. -- Ken Gaillot

Re: [ClusterLabs] FYI: one more regression introduced in Pacemaker 2.1.3

2022-06-28 Thread Ken Gaillot
Quick update: I believe only the redis and rabbitmq agents were affected, so most users don't have to care about this issue. On Mon, 2022-06-27 at 16:07 -0500, Ken Gaillot wrote: > Hi all, > > Another regression was found that was introduced in Pacemaker 2.1.3. > > As part o

Re: [ClusterLabs] is there a way to cancel a running live migration or a "resource stop" ?

2022-07-07 Thread Ken Gaillot
. Live migration is a multi-step process, so it is possible for the process to get interrupted in the middle, but in that case the resource will likely be restarted. -- Ken Gaillot

Re: [ClusterLabs] Fencing for quorum device?

2022-07-18 Thread Ken Gaillot
ave fencing for a quorum device? I > have 2 node cluster with one quorum device. Both 2 nodes have fencing > agents. > > But I wonder that should i define the fencing agent for quorum device > or not? Just in case it is laggy... > > Thank you

Re: [ClusterLabs] Q: About a false negative of storage_mon

2022-08-02 Thread Ken Gaillot
unori INOUE I agree, it makes sense to use O_DIRECT when available. I don't think an option is necessary. However, O_DIRECT is not available on all OSes, so the configure script should detect support. Also, it is not supported by all filesystems, so if the open fails, we should retry without
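A minimal C sketch of the O_DIRECT fallback described above, assuming Linux/glibc. The helper name `open_prefer_direct` is hypothetical and this is not the actual storage_mon code; the `#ifdef O_DIRECT` guard only stands in for the configure-time detection mentioned in the message.

```c
#define _GNU_SOURCE   /* glibc only exposes O_DIRECT when this is defined */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical helper (an assumption for illustration, not storage_mon code).
 * Opens 'path' read-only, preferring O_DIRECT so reads bypass the page cache.
 * If O_DIRECT is unknown at build time, or the open with O_DIRECT fails
 * (for example because the filesystem does not support it), the open is
 * retried without the flag.
 * Returns the file descriptor, or -1 if both attempts fail.
 * If 'used_direct' is non-NULL it is set to 1 when O_DIRECT was in effect.
 */
int open_prefer_direct(const char *path, int *used_direct)
{
    int fd = -1;

    assert(path != NULL);
    if (used_direct != NULL) {
        *used_direct = 0;
    }

#ifdef O_DIRECT
    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd >= 0) {
        if (used_direct != NULL) {
            *used_direct = 1;
        }
        return fd;
    }
#endif

    /* Fallback: plain open without O_DIRECT */
    return open(path, O_RDONLY);
}
```

A caller would check the return value as with a plain `open()`. Note that reads on a descriptor opened with O_DIRECT generally need suitably aligned buffers, which this sketch does not address. Retrying only when `errno` is `EINVAL` would be a possible refinement.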

Re: [ClusterLabs] cluster log not unambiguous about state of VirtualDomains

2022-08-03 Thread Ken Gaillot
(ocf::lentes:VirtualDomain): Started ha-idg-1 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-photoshop (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-check-

Re: [ClusterLabs] 2-Node Cluster - fencing with just one node running ?

2022-08-08 Thread Ken Gaillot
> > > NOT be usable for fencing ha-idg-1.
> > >
> > > fence_ilo_ha-idg-1 should be configured with pcmk_host_list=ha-idg-1,
> > > and fence_ilo_ha-idg-2 should be configured with pcmk_host_list=ha-idg-2.

Re: [ClusterLabs] node1 and node2 communication time question

2022-08-09 Thread Ken Gaillot
heduler, invoke the resource agent, and record the result if changed. When resource loss is detected, the stop/start time of the resource is the main factor. -- Ken Gaillot

Re: [ClusterLabs] node1 and node2 communication time question

2022-08-10 Thread Ken Gaillot
o unrecoverable situations and data loss. If your cluster nodes are virtual machines, and you have access to the host, this should work: https://wiki.clusterlabs.org/wiki/Guest_Fencing If you're using something else as cluster nodes, let us know. -- Ken Gaillot

Re: [ClusterLabs] Ordering - clones & relocation

2022-09-01 Thread Ken Gaillot
g. With an optional ordering, the dependent resource will get started after the primary resource *if* they both need to be started, but if only one needs to be started, the other won't be affected. -- Ken Gaillot

Re: [ClusterLabs] DC marks itself as OFFLINE, continues orchestrating the other nodes

2022-09-08 Thread Ken Gaillot
better. We can't just override the join state if the other nodes think it is different, but we could release DC and restart the join process. How did it handle the situation in this case? > > Thanks, > Lars -- Ken Gaillot

[ClusterLabs] Coming in Pacemaker 2.1.5: ACL enhancements

2022-09-19 Thread Ken Gaillot
accept an optional "name" attribute to use instead of the XML ID. If no name is specified, it will continue to use the XML ID, maintaining backward compatibility. The release will also have a few other small features and a bunch of bug fixes, including multiple regression fi

Re: [ClusterLabs] DC marks itself as OFFLINE, continues orchestrating the other nodes

2022-09-29 Thread Ken Gaillot
at 10:11:46AM -0500, Ken Gaillot wrote: > > On Thu, 2022-09-08 at 15:01 +0200, Lars Ellenberg wrote: > > > Scenario: > > > three nodes, no fencing (I know) > > > break network, isolating nodes > > > unbreak network, see how cluster partitions rejoin and res

Re: [ClusterLabs] Pacemaker question

2022-10-04 Thread Ken Gaillot
-- Ken Gaillot

Re: [ClusterLabs] trace of resource - sometimes restart, sometimes not

2022-10-06 Thread Ken Gaillot
ind more information about > DLM, because it is a mystery for me. > Sometimes the DLM does not respond to the "monitor", so it needs to > be restarted, and therefore all depending resources (which is a lot). > This happens under some load (although not completely overwhelme

Re: [ClusterLabs] crm resource trace

2022-10-17 Thread Ken Gaillot
l me those two files, I can try to figure out what happened. -- Ken Gaillot

Re: [ClusterLabs] crm resource trace

2022-10-17 Thread Ken Gaillot
3 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-sim (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-geneious (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-idcc-devel (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-genetrap (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-mouseidgenes (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-greensql (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-severin (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave ping_19216810010 (Stopped)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave ping_19216810020 (Stopped)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm_crispor (Stopped unmanaged)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-dietrich (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-pathway (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-crispor-server (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-geneious-license (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-nc-mcd (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-amok (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-geneious-license-mcd (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-documents-oo (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave fs_test_ocfs2 (Started ha-idg-2)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-ssh (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm_snipanalysis (Stopped unmanaged)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-seneca (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-photoshop (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-check-mk (Started ha-idg-1)
> Oct 14 19:26:33 [26000] ha-idg-1 pengine: info: LogActions: Leave vm-encore (Started ha-idg-1)
>
> no restart !!!
>
> There is only one difference i see is the section i marked with "-- > ".
> But i don't understand why this is different.
>
> Bernd

-- Ken Gaillot

Re: [ClusterLabs] crm resource trace

2022-10-18 Thread Ken Gaillot
On Tue, 2022-10-18 at 20:48 +0200, Lentes, Bernd wrote:
> - On 17 Oct, 2022, at 21:41, Ken Gaillot kgail...@redhat.com wrote:
>
> > This turned out to be interesting.
> >
> > In the first case, the resource history contains a start action and a

[ClusterLabs] Pacemaker-2.1.5-rc1 now available

2022-10-24 Thread Ken Gaillot
ks to all contributors of source code to this release, including bin-ly, Chris Lumens, Christine Caulfield, Ferenc Wágner, Gao,Yan, Grace Chin, Hideo Yamauchi, Jan Pokorný, Ken Gaillot, Klaus Wenninger, lihaipeng, luckhuanhuan, Petr Pavlu, Reid Wahl, Taketo Kabe, wangluwei, and wangmeng. -- Ken Ga

Re: [ClusterLabs] crm resource trace

2022-10-24 Thread Ken Gaillot
sers < [ > > > mailto:users@clusterlabs.org | users@clusterlabs.org ] > wrote: > > > > > > > > > Did you try a cleanup in between? > > > > When i do a cleanup before trace/untrace the resource is not > > restarted. > > When i don't do a cleanup it is restarted. > > > > Bernd -- Ken Gaillot

Re: [ClusterLabs] crm resource trace

2022-10-24 Thread Ken Gaillot
On Fri, 2022-10-21 at 13:05 +0200, Lentes, Bernd wrote:
> - On 17 Oct, 2022, at 21:41, Ken Gaillot kgail...@redhat.com wrote:
>
> > This turned out to be interesting.
> >
> > In the first case, the resource history contains a start action and a

[ClusterLabs] FYI: clusterlabs.org server maintenance window this weekend

2022-11-01 Thread Ken Gaillot
Hi everybody, Just FYI, the clusterlabs.org server (including the websites and mailing lists) will be taken down for planned maintenance this weekend. Most likely it will just be a few hours on Saturday, but if there are complications it could be longer. -- Ken Gaillot

Re: [ClusterLabs] VirtualDomain did not stop although "crm resource stop"

2022-11-02 Thread Ken Gaillot
because of the running > Live-Migration and would have start the shutdown when the Live- > Migration is finished ? > > Bernd > Yep. It's not specific to migration -- any actions already initiated have to finish before the cluster will do a

Re: [ClusterLabs] Fwd: corosync works but pacemaker is started and both processes exit

2022-11-02 Thread Ken Gaillot
nodeid: 2
> # Address of first link
> ring0_addr: node-2
> # When knet transport is used it's possible to define up to 8 links
> ring1_addr: 60.60.60.119
> }
> # ...
> service {
> var: 0
> name: pacemaker
> }
> }
>
> Attached is the log in debug mode

-- Ken Gaillot

[ClusterLabs] Second (and possibly final) release candidate for Pacemaker 2.1.5 now available

2022-11-15 Thread Ken Gaillot
the new release. We do many regression tests and simulations, but we can't cover all possible use cases, so your feedback is important and appreciated. Many thanks to all contributors of source code to this release, including Chris Lumens, Gao,Yan, and Ken Gaillot. -- Ken Ga

Re: [ClusterLabs] Unable to build rpm using make rpm command for pacemaker-2.1.4.

2022-11-21 Thread Ken Gaillot
e a tar archive
>
> gzip: stdin: unexpected end of file
> /usr/bin/tar: Child returned status 1
> /usr/bin/tar: Error is not recoverable: exiting now
> error: Bad exit status from /var/tmp/rpm-tmp.fb1j8n (%prep)
>
> RPM build errors:
> File /root/smf_sourc

[ClusterLabs] Third (and possibly final) release candidate for Pacemaker 2.1.5 now available

2022-11-22 Thread Ken Gaillot
ssion tests and simulations, but we can't cover all possible use cases, so your feedback is important and appreciated. -- Ken Gaillot
