[ClusterLabs] Antw: [EXT] [Problem] crm_attribute fails to expand run options.

2023-03-07 Thread Ulrich Windl
>>> schrieb am 07.03.2023 um 08:41 in Nachricht <215883790.380746.1678174910897.javamail.ya...@mail.yahoo.co.jp>: > Hi All, > > The crm_attribute command expands the contents of options from the > OCF_RESOURCE_INSTANCE environment variable if the p option is not specified. > > However, if

[ClusterLabs] Antw: [EXT] Release crmsh 4.4.1

2023-03-03 Thread Ulrich Windl
>>> Xin Liang via Users schrieb am 03.03.2023 um 08:00 >>> in Nachricht > Hello everyone! > > I'm happy to announce that the release of crmsh 4.4.1 is now available! > > Changes since tag 4.4.0 > > Features: > > * Enable "crm configure show related:" to show the objects by > given ra type

[ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] resource cloned group colocations

2023-03-02 Thread Ulrich Windl
>>> Gerald Vogt schrieb am 02.03.2023 um 17:27 in Nachricht : > On 02.03.23 14:51, Ulrich Windl wrote: >>>>> Gerald Vogt schrieb am 02.03.2023 um 14:43 in Nachricht >> <9ba5cd78-7b3d-32ef-38cf-5c5632c46...@spamcop.net>: >>> On 02.03.23 14:30, Ulr

[ClusterLabs] Antw: Re: Antw: [EXT] resource cloned group colocations

2023-03-02 Thread Ulrich Windl
>>> Gerald Vogt schrieb am 02.03.2023 um 14:43 in Nachricht <9ba5cd78-7b3d-32ef-38cf-5c5632c46...@spamcop.net>: > On 02.03.23 14:30, Ulrich Windl wrote: >>>>> Gerald Vogt schrieb am 02.03.2023 um 08:41 in Nachricht >> <624d0b70-5983-4d21-6777-55be9

[ClusterLabs] Antw: [EXT] resource cloned group colocations

2023-03-02 Thread Ulrich Windl
>>> Gerald Vogt schrieb am 02.03.2023 um 08:41 in Nachricht <624d0b70-5983-4d21-6777-55be91688...@spamcop.net>: > Hi, > > I am setting up a mail relay cluster whose main purpose is to maintain > the service IPs via IPaddr2 and move them between cluster nodes when > necessary. > > The service

[ClusterLabs] Antw: Re: Antw: [EXT] Systemd resource started on node after reboot before cluster is stable ?

2023-02-16 Thread Ulrich Windl
>>> Adam Cecile schrieb am 16.02.2023 um 11:13 in >>> Nachricht <4f0c1203-fd4b-a1cc-4e2f-44384c720...@le-vert.net>: > On 2/16/23 07:57, Ulrich Windl wrote: >>>>> Adam Cecile schrieb am 15.02.2023 um 10:49 in >> Nachricht >> : [...]

[ClusterLabs] Antw: [EXT] Systemd resource started on node after reboot before cluster is stable ?

2023-02-15 Thread Ulrich Windl
>>> Adam Cecile schrieb am 15.02.2023 um 10:49 in Nachricht : > Hello, > > Just had some issue with unexpected server behavior after reboot. This > node was powered off, so cluster was running fine with this tomcat9 > resource running on a different machine. > > After powering on this node

[ClusterLabs] Antw: [EXT] Coming in Pacemaker 2.1.6: node attribute enhancements

2023-02-06 Thread Ulrich Windl
>>> Ken Gaillot schrieb am 06.02.2023 um 16:29 in >>> Nachricht <1fc864736b788762d00fbc0b78da1b34fc1137d3.ca...@redhat.com>: > Hi all, > > Node attributes will receive several enhancements in the Pacemaker 2.1.6 > release, expected in late May. They are available now in the main > branch. > > A

[ClusterLabs] Antw: [EXT] Re: Problem with MariaDB cluster

2023-01-27 Thread Ulrich Windl
>>> Reid Wahl schrieb am 27.01.2023 um 09:32 in Nachricht : > On Fri, Jan 27, 2023 at 12:23 AM Thomas CAS wrote: > >> Hello Reid, >> >> >> >> Thank you so much for your answer and bug report. >> >> If it is a bug, I do not understand why the problem is present in >> production but not on my lab

[ClusterLabs] Antw: [EXT] Load balancing, of a sort

2023-01-25 Thread Ulrich Windl
>>> Antony Stone schrieb am 25.01.2023 um 13:48 in Nachricht <202301251348.58372.antony.st...@ha.open.source.it>: > Hi. > > I have a corosync / pacemaker 3-node cluster with a resource group which can > > run on any node in the cluster. > > Every night a cron job on the node which is running

[ClusterLabs] Antw: [EXT] resource-agents v4.12.0

2023-01-25 Thread Ulrich Windl
>>> Oyvind Albrigtsen schrieb am 25.01.2023 um 10:50 in Nachricht <20230125095007.bxalry76nfo7m...@redhat.com>: > ClusterLabs is happy to announce resource-agents v4.12.0. > > Source code is available at: > https://github.com/ClusterLabs/resource-agents/releases/tag/v4.12.0 > > The most

[ClusterLabs] Antw: Re: Antw: [EXT] Re: corosync 2.4.4 version provide secure the communication by default

2023-01-23 Thread Ulrich Windl
>>> Jan Friesse schrieb am 23.01.2023 um 15:54 in >>> Nachricht : > On 23/01/2023 12:51, Ulrich Windl wrote: >>>>> Jan Friesse schrieb am 23.01.2023 um 10:20 in >>>>> Nachricht >> : >>> Hi, >>> >>> On 23/01/202

[ClusterLabs] Antw: [EXT] Re: corosync 2.4.4 version provide secure the communication by default

2023-01-23 Thread Ulrich Windl
>>> Jan Friesse schrieb am 23.01.2023 um 10:20 in >>> Nachricht : > Hi, > > On 23/01/2023 01:37, S Sathish S via Users wrote: >> Hi Team, >> >> Does corosync version 2.4.4 provide a mechanism to secure the communication path > between nodes of a cluster by default? Because in our configuration secauth

[ClusterLabs] Antw: [EXT] Re: cibadmin response unexpected

2023-01-17 Thread Ulrich Windl
>>> Reid Wahl schrieb am 18.01.2023 um 05:46 in Nachricht : > On Tue, Jan 17, 2023 at 7:17 PM d tbsky wrote: >> >> Hi: >> I am using RHEL 9.1 with pacemaker-cli-2.1.4-5. I tried the command below: >> >> > cibadmin -Q -o xxx >> >> I expect the result to tell me that the "xxx" scope does not exist, but the

[ClusterLabs] Antw: [EXT] Re: Failed 'virsh' call when test RA run by crm_resource (con't) - SOLVED!

2023-01-12 Thread Ulrich Windl
>>> Reid Wahl schrieb am 13.01.2023 um 07:55 in Nachricht : > On Thursday, January 12, 2023, Ulrich Windl < > ulrich.wi...@rz.uni-regensburg.de> wrote: >>>>> Reid Wahl schrieb am 12.01.2023 um 18:00 in > Nachricht >> : >>>

[ClusterLabs] Antw: [EXT] Re: Failed 'virsh' call when test RA run by crm_resource (con't) - SOLVED!

2023-01-12 Thread Ulrich Windl
>>> Reid Wahl schrieb am 12.01.2023 um 18:00 in Nachricht : > On Thu, Jan 12, 2023 at 6:24 AM Madison Kelly wrote: ... > Hooray! I'm really glad someone figured this out. > > Based on a link that was shared in another thread, maybe it worked > fine on my machine due to a newer polkit version.

[ClusterLabs] Antw: [EXT] Re: Failed 'virsh' call when test RA run by crm_resource (con't)

2023-01-11 Thread Ulrich Windl
>>> Madison Kelly schrieb am 12.01.2023 um 07:36 in Nachricht <2e68d19b-90e2-98e2-47ed-17c4f69df...@alteeve.com>: > On 2023-01-12 01:26, Reid Wahl wrote: ... > Appreciate the stab, didn't stop the hang though :( Well, virsh has debugging options, too. > > -- > Madison Kelly > Alteeve's

[ClusterLabs] Antw: [EXT] Failed 'virsh' call when test RA run by crm_resource (con't)

2023-01-11 Thread Ulrich Windl
>>> Madison Kelly schrieb am 12.01.2023 um 05:10 in >>> Nachricht : > Hi all, > > There were a lot of sub-threads, so I figured it's helpful to start a > new thread with a summary so far. For context: I have a super simple > Perl script that pretends to be an RA for the sake of debugging. >

[ClusterLabs] Antw: [EXT] Re: RA hangs when called by crm_resource (resending text format)

2023-01-11 Thread Ulrich Windl
>>> Madison Kelly schrieb am 11.01.2023 um 22:06 in Nachricht <8a2f2d45-0419-8e97-1805-2998a9b83...@alteeve.com>: > On 2023-01-11 01:13, Vladislav Bogdanov wrote: >> I suspect that validate action is run as a non-root user. > > I modified the script to log the real and effective UIDs and it's >

[ClusterLabs] Antw: [EXT] pacemaker user question

2023-01-11 Thread Ulrich Windl
>>> "Jelen, Piotr" schrieb am 10.01.2023 um 14:51 >>> in Nachricht > Hi, > > I would like to ask you if the hacluster user and haclient group are hardcoded > into Pacemaker or if we can use other uid/gid than the standard 189/189? I'd consider any software that hard-codes UIDs or GIDs into

[ClusterLabs] Antw: [EXT] Re: RA hangs when called by crm_resource (resending text format)

2023-01-10 Thread Ulrich Windl
>>> Madison Kelly schrieb am 11.01.2023 um 06:21 in >>> Nachricht <74df2c8e-1cff-ba07-7f4a-070be296b...@alteeve.com>: > On 2023-01-11 00:14, Madison Kelly wrote: >> Hi all, >> >> Edit: Last message was in HTML format, sorry about that. >> >>I've got a hell of a weird problem, and I am

[ClusterLabs] Antw: [EXT] RA hangs when called by crm_resource

2023-01-10 Thread Ulrich Windl
Depending on the language your RA is written in, you could debug it, or try ocf-tester to debug your RA. For shell scripts you could add some "ocf_log debug ..." statements. >>> Madison Kelly schrieb am 11.01.2023 um 06:11 in >>> Nachricht <06935f6a-a858-c8fe-7b81-168157e5c...@alteeve.com>: >

[ClusterLabs] Antw: Re: Antw: [EXT] clear_failcount operation times out, makes it impossible to use the cluster

2023-01-04 Thread Ulrich Windl
>>> Ulrich Windl schrieb am 05.01.2023 um 08:22 in Nachricht <63B67A9F.36B : 161 : 60728>: >>>> Krzysztof Bodora schrieb am 04.01.2023 um 09:51 > in Nachricht : > > It's an old installation, the error started appearing when one of the > > nodes was disco

[ClusterLabs] Antw: Re: Antw: [EXT] clear_failcount operation times out, makes it impossible to use the cluster

2023-01-04 Thread Ulrich Windl
ol-1-rule-expr-1) > Ordering Constraints: > Colocation Constraints: > > Resources Defaults: > resource-stickiness: 10 > Operations Defaults: > record-pending: true > > Cluster Properties: > batch-limit: 1 > cluster-infrastructure: corosync >

[ClusterLabs] Antw: [EXT] clear_failcount operation times out, makes it impossible to use the cluster

2023-01-02 Thread Ulrich Windl
Hi! I wonder: Is this a new installation, or is it a new bug in an old installation? For the first case I'd recommend starting with current software, and for the second case please describe what changed or what triggered the situation. Also provide basic configuration data, please.

[ClusterLabs] Antw: [EXT] Re: Stonith external/ssh "device"?

2022-12-21 Thread Ulrich Windl
>>> Antony Stone schrieb am 21.12.2022 um 17:19 in Nachricht <202212211719.34369.antony.st...@ha.open.source.it>: > On Wednesday 21 December 2022 at 16:59:16, Antony Stone wrote: > >> Hi. >> >> I'm implementing fencing on a 7‑node cluster as described recently: >>

[ClusterLabs] Antw: [EXT] Re: Bug pacemaker with multiple IP

2022-12-21 Thread Ulrich Windl
You could also try something like "watch fuser $(which ip)" or (if you can) write a program using inotify and IN_OPEN to see which processes are opening the binary. >>> Thomas CAS schrieb am 21.12.2022 um 09:24 in Nachricht : > Ken, > > Antivirus (sophos-av) is running but not in "real time
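A rough sketch of the inotify idea mentioned above, written in Python via ctypes. It assumes Linux and a libc that exports inotify_init and inotify_add_watch; the helper names add_open_watch and read_event_masks are made up for illustration. Keep in mind that inotify only reports that an open happened, it does not name the opening process, so it would still have to be paired with something like fuser.

```python
import ctypes
import ctypes.util
import os
import struct

IN_OPEN = 0x00000020  # value from <sys/inotify.h>: file was opened

# struct inotify_event header: int wd; uint32_t mask, cookie, len;
_EVENT_HEADER = struct.Struct("iIII")

_libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6", use_errno=True)


def add_open_watch(path):
    """Return an inotify file descriptor that reports opens of `path`."""
    fd = _libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    wd = _libc.inotify_add_watch(fd, os.fsencode(path), IN_OPEN)
    if wd < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, "inotify_add_watch failed")
    return fd


def read_event_masks(fd):
    """Block until events arrive, then return the mask of each event."""
    data = os.read(fd, 4096)
    masks = []
    offset = 0
    while offset < len(data):
        _wd, mask, _cookie, name_len = _EVENT_HEADER.unpack_from(data, offset)
        masks.append(mask)
        # Skip the header plus the optional, NUL-padded file name.
        offset += _EVENT_HEADER.size + name_len
    return masks
```

To watch the ip binary one could call add_open_watch(shutil.which("ip")) and loop over read_event_masks(fd), logging a timestamp for every mask that has IN_OPEN set.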

[ClusterLabs] Antw: [EXT] Re: Bug pacemaker with multiple IP

2022-12-21 Thread Ulrich Windl
Hi! I wonder: Could the error message be triggered by adding an exclusive mandatory lock in the ip binary? If that triggers the bug, I'm rather sure that the error message is bad. Shouldn't that be EWOULDBLOCK then? (I have no idea how Sophos AV works, though. If they open the files to check in

[ClusterLabs] Antw: Re: Antw: [EXT] Re: Stonith

2022-12-21 Thread Ulrich Windl
>>> Ken Gaillot schrieb am 20.12.2022 um 16:21 in Nachricht <3a5960c2331f97496119720f6b5a760b3fe3bbcf.ca...@redhat.com>: > On Tue, 2022‑12‑20 at 11:33 +0300, Andrei Borzenkov wrote: >> On Tue, Dec 20, 2022 at 10:07 AM Ulrich Windl >> wrote: >> > > But k

[ClusterLabs] Antw: [EXT] Re: Bug pacemaker with multiple IP

2022-12-19 Thread Ulrich Windl
>>> Ken Gaillot schrieb am 19.12.2022 um 22:07 in >>> Nachricht <68e568cdcd675f8e4de8b01cd3ed27e2eb4ed28c.ca...@redhat.com>: ... > If simultaneous monitors is somehow causing the problem, you should be > able to work around it by using different intervals for different > monitors. I disagree:

[ClusterLabs] Antw: [EXT] Bug pacemaker with multiple IP

2022-12-19 Thread Ulrich Windl
>>> Thomas CAS schrieb am 19.12.2022 um 10:48 in Nachricht : > Hello Clusterlabs, > > I would like to report a bug on Pacemaker with the "IPaddr2" resource: > > OS: Debian 10 > Kernel: Linux wd‑websqlng01 4.19.0‑18‑amd64 #1 SMP Debian 4.19.208‑1 (2021‑09‑29) > x86_64 GNU/Linux > Pacemaker

[ClusterLabs] Antw: [EXT] Re: Stonith

2022-12-19 Thread Ulrich Windl
>>> Andrei Borzenkov schrieb am 19.12.2022 um 14:17 in Nachricht : > On Mon, Dec 19, 2022 at 4:01 PM Antony Stone > wrote: >> >> On Monday 19 December 2022 at 13:55:45, Andrei Borzenkov wrote: >> >> > On Mon, Dec 19, 2022 at 3:44 PM Antony Stone >> > >> > wrote: >> > > So, do I simply create

[ClusterLabs] Antw: [EXT] RFQ: Clusterlabs pacemaker administration

2022-12-15 Thread Ulrich Windl
Hi! First I think your e-mail disclaimer makes little sense when sending to a public list. Next I think it would be better to ask your question specifically for the distribution you use (assuming you do use some common distribution). I'm sure Red Hat and SUSE at least have some training offerings.

[ClusterLabs] Antw: [EXT] Re: mdraid - pacemaker resource agent

2022-12-09 Thread Ulrich Windl
>>> Roger Zhou via Users schrieb am 09.12.2022 um 11:26 in Nachricht : > On 12/9/22 17:36, Jelen, Piotr wrote: >> Hi Roger, >> >> Thank you for your quick reply, >> The mdraid resource agent works very well for us, >> Can you please tell me if there is any resource agent or tool built in >

[ClusterLabs] Antw: [EXT] Samba failover and Windows access

2022-12-07 Thread Ulrich Windl
>>> Dave Withheld schrieb am 08.12.2022 um 08:03 in Nachricht : > In our production factory, we run a 2‑node cluster on CentOS 8 with pacemaker, > a virtual IP, and drbd for shared storage with samba (among other services) > running as a resource on the active node. Everything works great

[ClusterLabs] Antw: [EXT] mdraid ‑ pacemaker resource agent

2022-12-07 Thread Ulrich Windl
>>> Ulrich Windl schrieb am 08.12.2022 um 07:55 in Nachricht <63918A51.502 : 161 : 60728>: >>>> "Jelen, Piotr" schrieb am 07.12.2022 um 11:44 in > Nachricht > om> > > > Hi ClusterLabs team , > > > > I would like to ask if thi

[ClusterLabs] Antw: [EXT] mdraid ‑ pacemaker resource agent

2022-12-07 Thread Ulrich Windl
>>> "Jelen, Piotr" schrieb am 07.12.2022 um 11:44 in Nachricht > Hi ClusterLabs team , > > I would like to ask if this resource agent was tested and if it can be used > in production? Hi! We have used it in production for more than 10 years now. Of course it requires a little bit of thinking when

[ClusterLabs] Antw: Re: Antw: [EXT] Preventing a resource from migrating to / starting on a node

2022-11-30 Thread Ulrich Windl
>>> Ken Gaillot schrieb am 29.11.2022 um 20:41 in Nachricht <23873b16397171f0880ee691a0aef5a6e3ef39a4.ca...@redhat.com>: > The use case here is for external code (DRBD) to ban a resource from a > node. While DRBD could add/remove location constraints, it's better to > have permanent (rule‑based)

[ClusterLabs] Antw: [EXT] Preventing a resource from migrating to / starting on a node

2022-11-28 Thread Ulrich Windl
Why can't you use a plain location constraint? >>> Madison Kelly schrieb am 29.11.2022 um 05:21 in >>> Nachricht <19cbecab-a7a0-5c3a-d074-efd3e8374...@alteeve.com>:

[ClusterLabs] Antw: Re: Antw: [EXT] DRBD Dual Primary Write Speed Extremely Slow

2022-11-14 Thread Ulrich Windl
>>> Tyler Phillippe via Users schrieb am 14.11.2022 um 15:48 in Nachricht : > Hi Vladislav, > > If I don't use the Scale-Out File Server, I don't have any issues with iSCSI > speeds: if I directly connect the LUN(s) to the individual servers, I get > 'full' speed - it just seems the Scale-Out

[ClusterLabs] Antw: [EXT] DRBD Dual Primary Write Speed Extremely Slow

2022-11-13 Thread Ulrich Windl
Hi! If you have plenty of RAM you could configure an iSCSI disk using a RAM disk and try how much I/O you get from there. Maybe your issue is not so much DRBD-related. However, when my local MD-RAID1 resyncs with about 120MB/s (spinning disks), the system also is hardly usable. Regards, Ulrich

[ClusterLabs] Antw: [EXT] Re: [External] : Re: Fence Agent tests

2022-11-06 Thread Ulrich Windl
>>> Jehan-Guillaume de Rorthais via Users schrieb am 05.11.2022 um 22:17 in Nachricht <20221105221756.20ea6761@karst>: ... > In Pacemaker mode, SBD is watching the two most important parts of the > cluster: > Pacemaker and Corosync: > > * the "Pacemaker watcher" of SBD connects to the CIB and

[ClusterLabs] Antw: [EXT] Re: [External] : Re: Fence Agent tests

2022-11-06 Thread Ulrich Windl
Hi! Maybe see "test-watchdog" in sbd's manual page ;-) Regards, Ulrich >>> Robert Hayden schrieb am 05.11.2022 um 19:47 in Nachricht >> ‑Original Message‑ >> From: Users On Behalf Of Valentin Vidic >> via Users >> Sent: Saturday, November 5, 2022 1:07 PM >> To:

[ClusterLabs] Antw: [EXT] VirtualDomain did not stop although "crm resource stop"

2022-11-03 Thread Ulrich Windl
>>> "Lentes, Bernd" schrieb am 02.11.2022 um 18:05 in Nachricht <973456304.11152064.1667408710062.javamail.zim...@helmholtz-muenchen.de>: > Hi, > > i think i found the reason, but i want to be sure. > I wanted to stop a VirtualDomain and did a "crm resource stop ..." > But it didn't shut down.

[ClusterLabs] Antw: [EXT] Re: crm resource trace

2022-10-19 Thread Ulrich Windl
Hi! I didn't read all the logs, mostly because I think those are very hard to read for humans. But I noticed: config="/mnt/share/vm_genetrap.xml" I think using /mnt as permanent path for anything is a bad idea. Regards, Ulrich >>> "Lentes, Bernd" schrieb am 17.10.2022 >>> um 12:43 in

[ClusterLabs] Antw: [EXT] Re: trace of resource ‑ sometimes restart, sometimes not

2022-10-18 Thread Ulrich Windl
>>> Ken Gaillot schrieb am 07.10.2022 um 01:08 in Nachricht : [...] > > trace_ra is unusual in that it's supported automatically by the OCF > shell functions, rather than by the agents directly. That means it's > not advertised in metadata. Otherwise agents could mark it as > reloadable, and

[ClusterLabs] Antw: [EXT] trace of resource ‑ sometimes restart, sometimes not

2022-10-18 Thread Ulrich Windl
>>> "Lentes, Bernd" schrieb am 06.10.2022 um 21:05 in Nachricht <280206366.19581344.1665083124300.javamail.zim...@helmholtz-muenchen.de>: > Hi, > > i have some problems with our DLM, so i wanted to trace it. Yesterday i just > set a trace for "monitor". No restart of DLM afterwards. It went

[ClusterLabs] Antw: [EXT] Re: DRBD and SQL Server

2022-09-26 Thread Ulrich Windl
>>> Brian schrieb am 26.09.2022 um 12:10 in Nachricht <5cb7ca0c-c502-44a9-8c74-c14d87f5e...@tenethor.ddns.net>: > Eric > > Up until recently I was running mariaDB on my cluster with a DRBD storage. > As for the server it ran perfectly fine in my use. Just a home server with > not too much

[ClusterLabs] RFE: sbd clone

2022-09-20 Thread Ulrich Windl
Hi! I have a proposal (request) for enhancing sbd: (I'm not suggesting a complete rewrite with reasonable options, as I had done that before already ;-)) When configuring an additional disk device, it would be quite handy to be able to "clone" the configuration from an existing device. As I

[ClusterLabs] Antw: [EXT] DC marks itself as OFFLINE, continues orchestrating the other nodes

2022-09-08 Thread Ulrich Windl
>>> Lars Ellenberg schrieb am 08.09.2022 um 15:01 in Nachricht : > Scenario: > three nodes, no fencing (I know) > break network, isolating nodes > unbreak network, see how cluster partitions rejoin and resume service > > > Funny outcome: > /usr/sbin/crm_mon -x pe-input-689.bz2 > Cluster

[ClusterLabs] Antw: [EXT] (no subject)

2022-09-07 Thread Ulrich Windl
>>> ??? schrieb am 07.09.2022 um 12:12 in Nachricht : > Hello. > I am a student who wants to implement a redundancy system with Raspberry Pi. > Last time, I posted about how to proceed with installation on Raspberry Pi > and received a lot of comments. > Among them, I searched a lot after looking

[ClusterLabs] Antw: [EXT] Re: Cluster does not start resources

2022-08-25 Thread Ulrich Windl
>>> "Lentes, Bernd" schrieb am 24.08.2022 >>> um 18:18 in Nachricht <949048121.170067950.1661357898189.javamail.zim...@helmholtz-muenchen.de>: > - On 24 Aug, 2022, at 16:26, kwenning kwenn...@redhat.com wrote: > >>> >>> if I get Ulrich right - and my fading memory of when I really used

[ClusterLabs] Antw: [EXT] Re: Cluster does not start resources

2022-08-25 Thread Ulrich Windl
>>> "Lentes, Bernd" schrieb am 24.08.2022 >>> um 18:15 in Nachricht <518321906.170067472.1661357732466.javamail.zim...@helmholtz-muenchen.de>: > - On 24 Aug, 2022, at 16:26, kwenning kwenn...@redhat.com wrote: > > > >> >> Guess the resources running now are those you tried to enable

[ClusterLabs] Antw: [EXT] Re: Cluster does not start resources

2022-08-24 Thread Ulrich Windl
>>> "Lentes, Bernd" schrieb am 24.08.2022 >>> um 15:14 in Nachricht <2097315299.169939002.1661346840678.javamail.zim...@helmholtz-muenchen.de>: > Hi, > > > Now with "crm resource start" all resources started. I didn't change > anything !?! I guess that command set the roles of all resources

[ClusterLabs] Antw: [EXT] Re: Cluster does not start resources

2022-08-24 Thread Ulrich Windl
>>> "Lentes, Bernd" schrieb am 24.08.2022 >>> um 07:13 in Nachricht <718962332.169073979.1661318038326.javamail.zim...@helmholtz-muenchen.de>: > > - On 24 Aug, 2022, at 07:03, arvidjaar arvidj...@gmail.com wrote: > >> On 24.08.2022 07:34, Lentes, Bernd wrote: >>> >>> >>> - On 24

[ClusterLabs] Antw: [EXT] Re: Cluster does not start resources

2022-08-24 Thread Ulrich Windl
>>> Reid Wahl schrieb am 24.08.2022 um 05:33 in Nachricht : > On Tue, Aug 23, 2022 at 7:10 PM Lentes, Bernd > wrote: ... > The stop-all-resources cluster property is set to true. Is that intentional? ... Independent of the problem, I suggest that a better log message is created. Maybe like

[ClusterLabs] Antw: [EXT] Re: Start resource only if another resource is stopped

2022-08-19 Thread Ulrich Windl
>>> Andrei Borzenkov schrieb am 18.08.2022 um 20:26 in Nachricht : ... > It is almost always wrong to have multiple independent pacemaker > resources managing the same underlying physical resource. It's not the cluster bible that says: "No man can have two masters" ;-) But it applies to clusters

[ClusterLabs] Antw: [EXT] Re: Start resource only if another resource is stopped

2022-08-12 Thread Ulrich Windl
>>> Andrei Borzenkov schrieb am 11.08.2022 um 21:40 in Nachricht : > On 11.08.2022 17:34, Miro Igov wrote: >> Hello, >> >> I am trying to create failover resource that would start if another resource >> is stopped and stop when the resource is started back. I wonder: Could it be implemented

[ClusterLabs] Antw: Re: Antw: [EXT] node1 and node2 communication time question

2022-08-10 Thread Ulrich Windl
>>> Andrei Borzenkov schrieb am 10.08.2022 um 10:13 in Nachricht <2c124efb-bc28-224b-031c-3ac2c111a...@gmail.com>: > On 10.08.2022 09:37, Ulrich Windl wrote: >> Unfortunately the documentation for fencing agents leaves very much to be > desired: >> When I tried to

[ClusterLabs] Antw: [EXT] node1 and node2 communication time question

2022-08-10 Thread Ulrich Windl
>>> ??? schrieb am 10.08.2022 um 03:35 in Nachricht : > Thank you for your reply. > Then, could you explain how to activate and set the stonith? There is no universal solution; instead it depends on the hardware you have. (If your devices are close to each other, one device's output port could

[ClusterLabs] Antw: Heads up for ldirectord in SLES12 SP5 "Use of uninitialized value $ip_port in pattern match (m//) at /usr/sbin/ldirectord line 1830"

2022-08-09 Thread Ulrich Windl
erver"); } One could check for "== '-2'" instead, but still in the other case there is no valid port value. Ideas? Regards, Ulrich >>> Ulrich Windl schrieb am 08.08.2022 um 11:19 in Nachricht <62F0D518.3F8 : >>> 161 : 60728>: > Hi! > >

[ClusterLabs] Antw: [EXT] Re: 2‑Node Cluster ‑ fencing with just one node running ?

2022-08-08 Thread Ulrich Windl
>>> Reid Wahl schrieb am 07.08.2022 um 04:01 in Nachricht : > On Saturday, August 6, 2022, Strahil Nikolov via Users < > users@clusterlabs.org> wrote: >> By the way I remember a lot of problems with fence_ilo & fence_ilo_ssh > (due to ILO). >> If you receive timeouts use fence_ipmi (you have to

[ClusterLabs] Antw: Heads up for ldirectord in SLES12 SP5 "Use of uninitialized value $ip_port in pattern match (m//) at /usr/sbin/ldirectord line 1830"

2022-08-08 Thread Ulrich Windl
ldirectord uses a SIGTERM handler that only sets a flag; the termination code is started at some later time. Doesn't that mean the cluster will see a bad exit code (success while parts of ldirectord are still running)? Regards, Ulrich >>> Ulrich Windl schrieb am 03.
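For readers unfamiliar with the pattern being discussed, here is a minimal Python sketch of a SIGTERM handler that only sets a flag, with the real shutdown work running later in the main loop. This is not ldirectord's code (ldirectord is a Perl program); run_until_sigterm, do_work and cleanup are hypothetical names used only for illustration.

```python
import os
import signal


def run_until_sigterm(do_work, cleanup, max_iterations=1000):
    """Call do_work(i) repeatedly until SIGTERM was seen, then call cleanup()."""
    state = {"stop": False}

    def on_sigterm(signum, frame):
        # The handler only records the request; no shutdown work happens here.
        state["stop"] = True

    previous = signal.signal(signal.SIGTERM, on_sigterm)
    iterations = 0
    try:
        while not state["stop"] and iterations < max_iterations:
            do_work(iterations)
            iterations += 1
        # Only now, some time after the signal arrived, does termination run.
        cleanup()
    finally:
        # Restore whatever handler was installed before.
        signal.signal(signal.SIGTERM, previous)
    return iterations
```

The gap between the signal arriving and cleanup() finishing is the window the question is about: whoever sent the signal has to wait for the process to really finish instead of treating delivery of SIGTERM as a completed stop.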

[ClusterLabs] Antw: [EXT] Heads up for ldirectord in SLES12 SP5 "Use of uninitialized value $ip_port in pattern match (m//) at /usr/sbin/ldirectord line 1830"

2022-08-04 Thread Ulrich Windl
_port is lost, # so it cannot be part of the error message } Apart from that, the critical part was that the "stop" operation SEEMED to have failed, causing fencing. Regardless of the success of resolving the names, ldirectord should be able to stop! ---

[ClusterLabs] Antw: [EXT] cluster log not unambiguous about state of VirtualDomains

2022-08-04 Thread Ulrich Windl
>>> "Lentes, Bernd" schrieb am 03.08.2022 um 17:01 in Nachricht <987208047.150503130.1659538882681.javamail.zim...@helmholtz-muenchen.de>: > Hi, > > i have a strange behaviour found in the cluster log > (/var/log/cluster/corosync.log). > I KNOW that i put one node (ha-idg-2) in standby mode and

[ClusterLabs] Antw: [EXT] Re: Q: About a false negative of storage_mon

2022-08-03 Thread Ulrich Windl
>>> Klaus Wenninger schrieb am 03.08.2022 um 15:51 in Nachricht : > On Tue, Aug 2, 2022 at 4:10 PM Ken Gaillot wrote: >> >> On Tue, 2022-08-02 at 19:13 +0900, 井上和徳 wrote: >> > Hi, >> > >> > Since O_DIRECT is not specified in open() [1], it reads the buffer >> > cache and >> > may result in a

[ClusterLabs] Heads up for ldirectord in SLES12 SP5 "Use of uninitialized value $ip_port in pattern match (m//) at /usr/sbin/ldirectord line 1830"

2022-08-03 Thread Ulrich Windl
Hi! I wanted to inform you of an unpleasant bug in ldirectord of SLES12 SP5: We had a short network problem while some redundancy paths were reconfigured in the infrastructure, with the effect that some network services could not be reached. Unfortunately ldirectord controlled by the cluster

[ClusterLabs] Antw: Re: Antw: [EXT] Re: Q: About a false negative of storage_mon

2022-08-03 Thread Ulrich Windl
>>> Andrei Borzenkov schrieb am 03.08.2022 um 08:58 in Nachricht : > On 03.08.2022 09:02, Ulrich Windl wrote: >>>>> Ken Gaillot schrieb am 02.08.2022 um 16:09 in >> Nachricht >> <0a2125a43bbfc09d2ca5bad1a693710f00e33731.ca...@redhat.com>: >

[ClusterLabs] Antw: Antw: [EXT] Re: Q: About a false negative of storage_mon

2022-08-03 Thread Ulrich Windl
>>> "Ulrich Windl" schrieb am 03.08.2022 um 08:02 in Nachricht <62ea0f6202a10004c...@gwsmtp.uni-regensburg.de>: >>>> Ken Gaillot schrieb am 02.08.2022 um 16:09 in > Nachricht > <0a2125a43bbfc09d2ca5bad1a693710f00e33731.ca...@redhat.com>:

[ClusterLabs] Antw: [EXT] Re: Q: About a false negative of storage_mon

2022-08-03 Thread Ulrich Windl
>>> Ken Gaillot schrieb am 02.08.2022 um 16:09 in Nachricht <0a2125a43bbfc09d2ca5bad1a693710f00e33731.ca...@redhat.com>: > On Tue, 2022-08-02 at 19:13 +0900, 井上和徳 wrote: >> Hi, >> >> Since O_DIRECT is not specified in open() [1], it reads the buffer >> cache and >> may result in a false

[ClusterLabs] Antw: [EXT] Re: Q: About a false negative of storage_mon

2022-08-02 Thread Ulrich Windl
>>> "Fabio M. Di Nitto" schrieb am 02.08.2022 um 14:30 in Nachricht <0b26c097-1e21-3945-24ba-355cd0ccf...@fabbione.net>: > Hello Kazunori-san, > > On 02/08/2022 12.13, 井上和徳 wrote: >> Hi, >> >> Since O_DIRECT is not specified in open() [1], it reads the buffer cache and >> may result in a false

[ClusterLabs] Antw: [EXT] Re: QDevice not found after reboot but appears after cluster restart

2022-08-01 Thread Ulrich Windl
>>> "john tillman" schrieb am 29.07.2022 um 22:51 in Nachricht : >> > On Thursday 28 July 2022 at 22:17:01, john tillman wrote: >>> I have a two cluster setup with a qdevice. 'pcs quorum status' from a cluster node shows the qdevice casting a vote. On the qdevice node

[ClusterLabs] Antw: [EXT] IPaddr2 resource times out and can't be killed

2022-08-01 Thread Ulrich Windl
>>> Ross Sponholtz schrieb am 29.07.2022 um 21:51 in Nachricht > I’m running a RHEL pacemaker cluster on Azure, and I’ve gotten a failure & > fencing where I get these messages in the log file: > > warning: vip_ABC_30_monitor_1 process (PID 1779737) timed out > crit:

[ClusterLabs] Antw: Re: [EXT] Problem with DLM

2022-07-28 Thread Ulrich Windl
>>> "Lentes, Bernd" schrieb am 26.07.2022 >>> um 21:36 in Nachricht <1994685463.141245271.1658864186207.javamail.zim...@helmholtz-muenchen.de>: > > - On 26 Jul, 2022, at 20:06, Ulrich Windl > ulrich.wi...@rz.uni-regensburg.de wrote: > >

Re: [ClusterLabs] [EXT] Problem with DLM

2022-07-26 Thread Ulrich Windl
Hi Bernd! I think the answer may be some time before the timeout was reported; maybe a network issue? Or a very high load. It's hard to say from the logs... >>> Am 26.07.2022 um 15:32, in Nachricht <6ABA7762.4E4 : 205 : 62692>, "Lentes, Bernd" schrieb: Hi, it seems my DLM went crazy:

[ClusterLabs] Antw: [EXT] Cannot add a node with pcs

2022-07-13 Thread Ulrich Windl
>>> Piotr Szafarczyk schrieb am 12.07.2022 um 12:34 in Nachricht <38ccc24a-7b01-561c-20f8-ec2273a18...@netexpert.pl>: > Hi, > > I used to have a working cluster with 3 nodes (and stonith disabled). The SLES guide says: Important: No Support Without STONITH You must have a node fencing

[ClusterLabs] Antw: [EXT] nfs mount resource won't stop - lazy option ?

2022-07-11 Thread Ulrich Windl
>>> Patrick Vranckx schrieb am 08.07.2022 um >>> 11:25 in Nachricht <69422771-c57d-d910-8e92-8a2345847...@uclouvain.be>: > Hi, > > I have a cluster configured as such: > > Online: [ mbacktm1 mbacktm2 ] > > Full list of resources: > > stonith-mbt(stonith:fence_scsi):Started mbacktm2

[ClusterLabs] Antw: [EXT] Questions related to failover time

2022-07-11 Thread Ulrich Windl
>>> ??? schrieb am 08.07.2022 um 04:30 in Nachricht : > Hello. > I am a college student who wants to apply pacemaker and corosync to > Raspberry Pi. > What I'm going to do is reduce the failover time of two Raspberry Pis. Reduce from what to what? > So I have a question. > Is there a way to

[ClusterLabs] Antw: [EXT] is there a way to cancel a running live migration or a "resource stop" ?

2022-07-11 Thread Ulrich Windl
>>> "Lentes, Bernd" schrieb am 07.07.2022 um 11:28 in Nachricht <2146740519.119706769.1657186111711.javamail.zim...@helmholtz-muenchen.de>: > Hi, > > is there a way to cancel a running live migration or a "resource stop" ? And the effect of canceling would be what? Well, the operation to cancel

[ClusterLabs] Antw: [EXT] FYI: one more regression introduced in Pacemaker 2.1.3

2022-07-11 Thread Ulrich Windl
>>> Ken Gaillot schrieb am 27.06.2022 um 23:07 in Nachricht : > Hi all, > > Another regression was found that was introduced in Pacemaker 2.1.3. > > As part of trying to make the output of various commands more > consistent, as of 2.1.3, "crm_attribute --quiet --query" prints > "(null)" for an

[ClusterLabs] Antw: [EXT] modified RA can't be used

2022-07-11 Thread Ulrich Windl
>>> "Lentes, Bernd" schrieb am 27.06.2022 >>> um 14:54 in Nachricht <273609196.108116045.1656334469407.javamail.zim...@helmholtz-muenchen.de>: > Hi, > > i adapted the RA ocf/heartbeat/VirtualDomain to my needs and renamed it to > VirtualDomain.ssh > When i try to use it now, i get an error

[ClusterLabs] Antw: [EXT] Question regarding the security of corosync

2022-06-21 Thread Ulrich Windl
>>> Mario Freytag schrieb am 17.06.2022 um 11:39 in Nachricht : > Dear sirs, or madams, > > I’d like to ask about the security of corosync. We’re using a Proxmox HA > setup in our testing environment and need to confirm its compliance with PCI > guidelines. > > We have a few questions: > >

[ClusterLabs] Antw: [EXT] related to fencing in general , docker containers

2022-06-20 Thread Ulrich Windl
I pointed out a long time ago that the existing documentation on fencing agents is practically unusable. >>> Sridhar K schrieb am 17.06.2022 um 15:53 in Nachricht : > Hi Team, > > Please share any pointers, references, example usages w.r.t. fencing in > general and its use w.r.t. Docker

[ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Re: Why not retry a monitor (pacemaker-execd) that got a segmentation fault?

2022-06-15 Thread Ulrich Windl
>>> Klaus Wenninger schrieb am 15.06.2022 um 13:22 in Nachricht : > On Wed, Jun 15, 2022 at 10:33 AM Ulrich Windl > wrote: >> ... >> (As said above it may be some RAM corruption where SMI (system management >> interrupts, or so) play a role, but Dell says the

[ClusterLabs] Antw: Re: Antw: [EXT] Re: Why not retry a monitor (pacemaker-execd) that got a segmentation fault?

2022-06-15 Thread Ulrich Windl
>>> Klaus Wenninger schrieb am 15.06.2022 um 10:00 in Nachricht : > On Wed, Jun 15, 2022 at 8:32 AM Ulrich Windl > wrote: >> >> >>> Ulrich Windl schrieb am 14.06.2022 um 15:53 in Nachricht <62A892F0.174 : 161 > : >> 60728>: >> >>

[ClusterLabs] Antw: [EXT] Re: Why not retry a monitor (pacemaker-execd) that got a segmentation fault?

2022-06-15 Thread Ulrich Windl
>>> Ulrich Windl schrieb am 14.06.2022 um 15:53 in Nachricht <62A892F0.174 : >>> 161 : 60728>: ... > Yes it's odd, but isn't the cluster just to protect us from odd situations? > ;-) I have more odd stuff: Jun 14 20:40:09 rksaph18 p

[ClusterLabs] Antw: [EXT] Re: Why not retry a monitor (pacemaker-execd) that got a segmentation fault?

2022-06-14 Thread Ulrich Windl
>>> Ken Gaillot schrieb am 14.06.2022 um 15:49 in Nachricht : > On Tue, 2022-06-14 at 14:36 +0200, Ulrich Windl wrote: >> Hi! >> >> I had a case where a VirtualDomain monitor operation ended in a core >> dump (actually it was pacemaker-execd, but

[ClusterLabs] Why not retry a monitor (pacemaker-execd) that got a segmentation fault?

2022-06-14 Thread Ulrich Windl
Hi! I had a case where a VirtualDomain monitor operation ended in a core dump (actually it was pacemaker-execd, but it counted as a "monitor" operation), and the cluster decided to restart the VM. Wouldn't it be worth retrying the monitor operation first? Chances are that a re-tried monitor

[ClusterLabs] Antw: [EXT] crm status shows CURRENT DC as None

2022-06-14 Thread Ulrich Windl via Users
>>> Priyanka Balotra schrieb am 14.06.2022 um >>> 07:40 in Nachricht : > Hi Folks, > > crm status shows CURRENT DC as None. Please check and let us know why the > current DC is not pointing to any of the nodes. > Maybe present the complete output of "crm_mon -1Arfj". For example where are

[ClusterLabs] Antw: [EXT] fencing configuration

2022-06-07 Thread Ulrich Windl
>>> Zoran Bošnjak schrieb am 07.06.2022 um 10:26 in Nachricht <1951254459.265.1654590407828.javamail.zim...@via.si>: > Hi, I need some help with correct fencing configuration in a 5-node cluster. > > The specific issue is that there are 3 rooms, where in addition to node > failure scenario, each

[ClusterLabs] Antw: [EXT] Re: normal reboot with active sbd does not work

2022-06-06 Thread Ulrich Windl
>>> Andrei Borzenkov schrieb am 03.06.2022 um 17:04 in Nachricht <99f7746a-c962-33bb-6737-f88ba0128...@gmail.com>: > On 03.06.2022 16:51, Zoran Bošnjak wrote: >> Thanks for all your answers. Sorry, my mistake. The ipmi_watchdog is indeed > OK. I was first experimenting with "softdog", which is

[ClusterLabs] Antw: Re: Antw: [EXT] normal reboot with active sbd does not work

2022-06-06 Thread Ulrich Windl
would reset the alert, but it would also cause interrupted IPMI communication. ipmitool -I open sel clear may work, too, but it will clear the event log. Regards, Ulrich > > "echo V >/dev/watchdog" makes no difference. > > - Original Message - > From: "

[ClusterLabs] Antw: [EXT] Re: normal reboot with active sbd does not work

2022-06-03 Thread Ulrich Windl
>>> Klaus Wenninger schrieb am 03.06.2022 um 11:03 in Nachricht : > On Fri, Jun 3, 2022 at 10:19 AM Zoran Bošnjak wrote: ... > still opened by sbd. In general I don't see why the watchdog-module should > be unloaded upon shutdown. So as a first try you just might remove that Specifically if the

[ClusterLabs] Antw: [EXT] normal reboot with active sbd does not work

2022-06-03 Thread Ulrich Windl
>>> Zoran Bošnjak schrieb am 03.06.2022 um 10:18 in Nachricht <2046503996.272.1654244336372.javamail.zim...@via.si>: > Hi all, > I would appreciate advice about sbd fencing (without shared storage). Not an answer, but curiosity: As sbd needs very little space (like just 1MB), did anybody ever

[ClusterLabs] Q: "Multiple attributes match name=target-role"

2022-06-02 Thread Ulrich Windl
Hi! We had some issue with a resource in SLES15 SP3, so I cleaned the error: h16:~ # crm_resource -C -r prm_xen_v07 -N h18 -n start Cleaned up prm_xen_v07 on h18 Multiple attributes match name=target-role Value: Started(id=prm_xen_v07-meta_attributes-target-role) Value: Started

[ClusterLabs] What's the number in "Servant pcmk is outdated (age: 682915)"

2022-06-01 Thread Ulrich Windl
Hi! I'm wondering what the number in parentheses is for these messages: sbd[6809]: warning: inquisitor_child: pcmk health check: UNHEALTHY sbd[6809]: warning: inquisitor_child: Servant pcmk is outdated (age: 682915) Regards, Ulrich ___ Manage your

[ClusterLabs] Antw: [EXT] No node name in corosync-cmapctl output

2022-05-31 Thread Ulrich Windl
>>> Andreas Hasenack schrieb am 31.05.2022 um 15:16 in Nachricht : > Hi, > > corosync 3.1.6 > pacemaker 2.1.2 > crmsh 4.3.1 > > TL;DR > I only seem to get a "name" attribute in the "corosync-cmapctl | grep > nodelist" output if I set an explicit name in corosync.conf's > nodelist. If I rely on

[ClusterLabs] More pacemaker oddities while stopping DC

2022-05-25 Thread Ulrich Windl
Hi! We are still suffering from kernel RAM corruption on the Xen hypervisor when a VM or the hypervisor is doing I/O (three months since the bug report at SUSE, but no fix or workaround meaning the whole Xen cluster project was canceled after 20 years, but that's a different topic). All VMs

[ClusterLabs] Antw: [EXT] What/how to clean up when bootstrapping new cluster (or: I have a phantom node)

2022-05-24 Thread Ulrich Windl
>>> Andreas Hasenack schrieb am 24.05.2022 um 22:05 in Nachricht : > Hi, > > I'm trying to find out the correct steps to start a corosync/pacemaker > cluster right after installing its packages in Debian or Ubuntu. > > I'm not using crmsh or pcs on purpose, I really wanted to get this > basic

[ClusterLabs] Antw: Re: Antw: [EXT] Re: Cluster unable to find back together

2022-05-24 Thread Ulrich Windl
>>> Klaus Wenninger schrieb am 23.05.2022 um 19:43 in Nachricht : > On Fri, May 20, 2022 at 7:43 AM Ulrich Windl > wrote: >> >> >>> Jan Friesse schrieb am 19.05.2022 um 14:55 in >> Nachricht >> <1abb8468-6619-329f-cb01-3f51112db...@redhat.com
