Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Andrei Borzenkov
On Mon, Sep 4, 2023 at 4:44 PM David Dolan wrote: > > Thanks Klaus\Andrei, > > So if I understand correctly what I'm trying probably shouldn't work. It is impossible to configure corosync (or any other cluster system for that matter) to keep the *arbitrary* last node quorate. It is possible to

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread David Dolan
Thanks Klaus\Andrei, So if I understand correctly what I'm trying probably shouldn't work. And I should attempt setting auto_tie_breaker in corosync and remove last_man_standing. Then, I should set up another server with qdevice and configure that using the LMS algorithm. Thanks David On Mon, 4

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Klaus Wenninger
On Mon, Sep 4, 2023 at 1:50 PM Andrei Borzenkov wrote: > On Mon, Sep 4, 2023 at 2:18 PM Klaus Wenninger > wrote: > > > > > > > > On Mon, Sep 4, 2023 at 12:45 PM David Dolan > wrote: > >> > >> Hi Klaus, > >> > >> With default quorum options I've performed the following on my 3 node > cluster >

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Klaus Wenninger
On Mon, Sep 4, 2023 at 1:44 PM Andrei Borzenkov wrote: > On Mon, Sep 4, 2023 at 2:25 PM Klaus Wenninger > wrote: > > > > > > Or go for qdevice with LMS where I would expect it to be able to really > go down to > > a single node left - any of the 2 last ones - as there is still qdevice. > > Sorry

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Andrei Borzenkov
On Mon, Sep 4, 2023 at 2:18 PM Klaus Wenninger wrote: > > > > On Mon, Sep 4, 2023 at 12:45 PM David Dolan wrote: >> >> Hi Klaus, >> >> With default quorum options I've performed the following on my 3 node cluster >> >> Bring down cluster services on one node - the running services migrate to >>

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Andrei Borzenkov
On Mon, Sep 4, 2023 at 2:25 PM Klaus Wenninger wrote: > > > Or go for qdevice with LMS where I would expect it to be able to really go > down to > a single node left - any of the 2 last ones - as there is still qdevice. > Sorry for the confusion btw. > According to documentation, "LMS is also

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Klaus Wenninger
On Mon, Sep 4, 2023 at 1:18 PM Klaus Wenninger wrote: > > > On Mon, Sep 4, 2023 at 12:45 PM David Dolan wrote: > >> Hi Klaus, >> >> With default quorum options I've performed the following on my 3 node >> cluster >> >> Bring down cluster services on one node - the running services migrate to >>

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Klaus Wenninger
On Mon, Sep 4, 2023 at 12:45 PM David Dolan wrote: > Hi Klaus, > > With default quorum options I've performed the following on my 3 node > cluster > > Bring down cluster services on one node - the running services migrate to > another node > Wait 3 minutes > Bring down cluster services on one of

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Andrei Borzenkov
On Mon, Sep 4, 2023 at 1:45 PM David Dolan wrote: > > Hi Klaus, > > With default quorum options I've performed the following on my 3 node cluster > > Bring down cluster services on one node - the running services migrate to > another node > Wait 3 minutes > Bring down cluster services on one of

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread David Dolan
Hi Klaus, With default quorum options I've performed the following on my 3 node cluster Bring down cluster services on one node - the running services migrate to another node Wait 3 minutes Bring down cluster services on one of the two remaining nodes - the surviving node in the cluster is then

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-08-31 Thread David Dolan
I just tried removing all the quorum options setting back to defaults so no last_man_standing or wait_for_all. I still see the same behaviour where the third node is fenced if I bring down services on two nodes. Thanks David On Thu, 31 Aug 2023 at 11:44, Klaus Wenninger wrote: > > > On Thu, Aug
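A minimal sketch of the vote arithmetic behind the behaviour reported above, assuming plain majority quorum with one vote per node and none of the votequorum options (last_man_standing, auto_tie_breaker, wait_for_all) or a qdevice in play. The function names are illustrative only and are not part of corosync or Pacemaker.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(live_votes: int, expected_votes: int) -> bool:
    """True if the live partition holds a strict majority of the expected votes."""
    return live_votes >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    expected = 3  # three-node cluster, one vote per node (assumption)
    for live in (3, 2, 1):
        state = "quorate" if is_quorate(live, expected) else "not quorate"
        print(f"{live} of {expected} votes live: {state}")
```

Under these assumptions a single surviving node holds one vote against a threshold of two, so it cannot be quorate on its own, which is in line with the replies in this thread that point to fencing or a qdevice rather than quorum options alone.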

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-08-31 Thread Klaus Wenninger
On Thu, Aug 31, 2023 at 12:28 PM David Dolan wrote: > > > On Wed, 30 Aug 2023 at 17:35, David Dolan wrote: > >> >> >> > Hi All, >>> > >>> > I'm running Pacemaker on Centos7 >>> > Name: pcs >>> > Version : 0.9.169 >>> > Release : 3.el7.centos.3 >>> > Architecture: x86_64 >>> >

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-08-31 Thread David Dolan
On Wed, 30 Aug 2023 at 17:35, David Dolan wrote: > > > > Hi All, >> > >> > I'm running Pacemaker on Centos7 >> > Name: pcs >> > Version : 0.9.169 >> > Release : 3.el7.centos.3 >> > Architecture: x86_64 >> > >> > >> Besides the pcs-version versions of the other

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-08-30 Thread Andrei Borzenkov
On 30.08.2023 19:23, David Dolan wrote: Use fencing. Quorum is not a replacement for fencing. With (reliable) fencing you can simply run pacemaker with no-quorum-policy=ignore. The practical problem is that usually the last resort that will work in all cases is SBD + suicide and SBD cannot

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-08-30 Thread David Dolan
> Hi All, > > > > I'm running Pacemaker on Centos7 > > Name: pcs > > Version : 0.9.169 > > Release : 3.el7.centos.3 > > Architecture: x86_64 > > > > > Besides the pcs-version versions of the other cluster-stack-components > could be interesting. (pacemaker, corosync) > rpm -qa |

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-08-30 Thread David Dolan
> > > > > Hi All, > > > > I'm running Pacemaker on Centos7 > > Name: pcs > > Version : 0.9.169 > > Release : 3.el7.centos.3 > > Architecture: x86_64 > > > > > > I'm performing some cluster failover tests in a 3 node cluster. We have > 3 resources in the cluster. > > I was trying to

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-08-30 Thread Klaus Wenninger
On Wed, Aug 30, 2023 at 2:34 PM David Dolan wrote: > Hi All, > > I'm running Pacemaker on Centos7 > Name: pcs > Version : 0.9.169 > Release : 3.el7.centos.3 > Architecture: x86_64 > > Besides the pcs-version versions of the other cluster-stack-components could be interesting.

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-08-30 Thread Andrei Borzenkov
On Wed, Aug 30, 2023 at 3:34 PM David Dolan wrote: > > Hi All, > > I'm running Pacemaker on Centos7 > Name: pcs > Version : 0.9.169 > Release : 3.el7.centos.3 > Architecture: x86_64 > > > I'm performing some cluster failover tests in a 3 node cluster. We have 3 > resources in the

[ClusterLabs] issue during Pacemaker failover testing

2023-08-30 Thread David Dolan
Hi All, I'm running Pacemaker on Centos7 Name: pcs Version : 0.9.169 Release : 3.el7.centos.3 Architecture: x86_64 I'm performing some cluster failover tests in a 3 node cluster. We have 3 resources in the cluster. I was trying to see if I could get it working if 2 nodes fail at

Re: [ClusterLabs] issue with awscli profile for AWS resource agents

2021-07-09 Thread kgaillot
On Thu, 2021-07-08 at 13:18 +, Aaron Kennedy wrote: > > Hello, > > I am trying to use AWS resource agents such as ‘awsvip’ and ‘awseip’ > but my awscli profile “could not be found” > > [ec2-user@ip-172-31-43-116 ~]$ sudo pcs resource debug-start --full > privip

[ClusterLabs] issue with awscli profile for AWS resource agents

2021-07-08 Thread Aaron Kennedy
Hello, I am trying to use AWS resource agents such as ‘awsvip’ and ‘awseip’ but my awscli profile “could not be found” [ec2-user@ip-172-31-43-116 ~]$ sudo pcs resource debug-start --full privip warning: unpack_rsc_op_failure:Processing failed start of privip on

Re: [ClusterLabs] Issue with Pacemaker config related to VIP and an LSB resource

2021-06-16 Thread Michael Romero
“But in general I guess the idea of rechecking resource after failure timeout once (similar to initial probe) sounds interesting. It could be more robust in that resource agent could check whether resource start is possible now at all and prevent unsuccessful attempt to migrate resource back to

Re: [ClusterLabs] Issue with Pacemaker config related to VIP and an LSB resource

2021-06-15 Thread Andrei Borzenkov
On 16.06.2021 01:49, Michael Romero wrote: > > At which point an administrator or an automated script could intervene If you are going to always use manual intervention outside of pacemaker, just leave failure timeout on default 0 so cluster will never clear failure count automatically on a

Re: [ClusterLabs] Issue with Pacemaker config related to VIP and an LSB resource

2021-06-15 Thread Andrei Borzenkov
On 16.06.2021 01:49, Michael Romero wrote: > Hello, > > I currently have Pacemaker v2.0.3-3ubuntu4.2 running on two Ubuntu 20.04 > LTS systems. My config consists of two service groups, both of which have > an LSB resource and a floating IP resource. The LSB resource is > configured with a

[ClusterLabs] Issue with Pacemaker config related to VIP and an LSB resource

2021-06-15 Thread Michael Romero
Hello, I currently have Pacemaker v2.0.3-3ubuntu4.2 running on two Ubuntu 20.04 LTS systems. My config consists of two service groups, both of which have an LSB resource and a floating IP resource. The LSB resource is configured with a monitor operation, so that "/etc/init.d/ status" is run in

Re: [ClusterLabs] issue

2020-11-17 Thread Reid Wahl
n has been defined for the resource, so Pacemaker is not even checking the resource's status. If this is the case, then a monitor operation needs to be added. > > ------ Forwarded message - > From: Guy Przytula > Date: Tue, Nov 17, 2020 at 2:33 AM > Subject: Re: [ClusterLabs]

Re: [ClusterLabs] issue

2020-11-17 Thread Reid Wahl
Guy emailed me again directly, stating that the mailing list does not accept URLs. Guy, I'm honestly not sure what you're referring to, since Ken and I have both posted URLs within this thread. ~~~ we downloaded pacemaker from :

Re: [ClusterLabs] issue

2020-11-17 Thread Ken Gaillot
On Tue, 2020-11-17 at 08:23 +0100, Guy Przytula wrote: > sorry for coming back and thanks for the answers > but how do you make a relation between your resource(s) and the > script ? You would configure a resource in the Pacemaker configuration, specifying the agent (script), resource parameters,

Re: [ClusterLabs] issue

2020-11-17 Thread Reid Wahl
On Mon, Nov 16, 2020 at 11:23 PM Guy Przytula wrote: > sorry for coming back and thanks for the answers > > but how do you make a relation between your resource(s) and the script ? > > a link to a doc would be nice.. so I do not need to ask questions to the > group.. > Ken provided the

Re: [ClusterLabs] issue

2020-11-17 Thread Klaus Wenninger
On 11/17/20 8:23 AM, Guy Przytula wrote: > > sorry for coming back and thanks for the answers > > but how do you make a relation between your resource(s) and the script ? > Not sure if this really was your question but you might have a look at /usr/lib/ocf/resource.d/heartbeat/... Klaus > > a

Re: [ClusterLabs] issue

2020-11-16 Thread Guy Przytula
sorry for coming back and thanks for the answers but how do you make a relation between your resource(s) and the script ? a link to a doc would be nice..  so I do not need to ask questions to the group.. Best Regards, Beste Groeten, Meilleures Salutations /Guy Przytula/ Tel. GSM : +32

Re: [ClusterLabs] issue

2020-11-16 Thread Ken Gaillot
On Sun, 2020-11-15 at 21:32 -0800, Reid Wahl wrote: > ocf:heartbeat:db2inst isn't a resource agent that ClusterLabs > maintains, so we have no insight into how it works and why it's in > Started state. I don't know based on the output what resource agent > db2_db2ins11_db2ins11_QUERYDB is using. >

Re: [ClusterLabs] issue

2020-11-15 Thread Reid Wahl
ocf:heartbeat:db2inst isn't a resource agent that ClusterLabs maintains, so we have no insight into how it works and why it's in Started state. I don't know based on the output what resource agent db2_db2ins11_db2ins11_QUERYDB is using. I recommend that you take a look at the resource agent's

[ClusterLabs] issue

2020-11-15 Thread Guy Przytula
I have installed latest version of pacemaker on redhat 8 I wanted to test it out for a cluster for IBM Db2 There is only one issue : I have 2 nodes : nodep and nodes the database/instance resource are primary on nodep if I stop the process (db2) on nodes : it is automatically started : ok

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-08 Thread Rohit Saini
Hi Ondrej, Yes, you are right. This issue was specific to floating IPs, not with local IPs. Post becoming master, I was sending "Neighbor Advertisement" message for my floating IPs. This was a raw message which was created by me, so I was the one who was setting flags in it. Please find attached

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-08 Thread Ondrej
On 4/5/19 8:18 PM, Rohit Saini wrote: *Further update on this:* This issue is resolved now. ILO was discarding "Neighbor Advertisement" (NA) as Solicited flag was set in NA message. Hence it was not updating its local neighbor table. As per RFC, Solicited flag should be set only in NA message

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-05 Thread Rohit Saini
*Further update on this:* This issue is resolved now. ILO was discarding "Neighbor Advertisement" (NA) as Solicited flag was set in NA message. Hence it was not updating its local neighbor table. As per RFC, Solicited flag should be set only in NA message when it is a response to Neighbor
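A minimal sketch of an unsolicited Neighbor Advertisement body with the Solicited flag cleared, as the fix above describes. The field layout follows RFC 4861 section 4.4. The function name, the example MAC address and the decision to set the Override flag are assumptions for illustration and do not come from the thread. The checksum is left at zero here on the assumption that the sending stack fills it in.

```python
import ipaddress
import struct

ICMPV6_NEIGHBOR_ADVERTISEMENT = 136  # ICMPv6 type for NA (RFC 4861)
FLAG_SOLICITED = 0x40000000          # S bit: only for replies to a Neighbor Solicitation
FLAG_OVERRIDE = 0x20000000           # O bit: ask receivers to replace their cached entry
OPT_TARGET_LINK_LAYER_ADDR = 2       # option type for Target Link-Layer Address


def build_unsolicited_na(target: str, mac: bytes) -> bytes:
    """Build the ICMPv6 part of an unsolicited NA (hypothetical helper)."""
    # S stays clear because this message is not a reply to a solicitation.
    flags = FLAG_OVERRIDE
    # type, code, checksum (zero placeholder), flags + reserved bits
    header = struct.pack("!BBHI", ICMPV6_NEIGHBOR_ADVERTISEMENT, 0, 0, flags)
    target_addr = ipaddress.IPv6Address(target).packed
    # option type, length in units of 8 octets, 6-byte Ethernet address
    tlla_option = struct.pack("!BB6s", OPT_TARGET_LINK_LAYER_ADDR, 1, mac)
    return header + target_addr + tlla_option


# fd00:1061:37:9021:: is the floating IP quoted in this thread; the MAC is made up.
packet = build_unsolicited_na("fd00:1061:37:9021::", bytes.fromhex("020000000001"))
```

The flags sit in the top three bits of the fifth byte, so checking `packet[4] & 0x40` is a quick way to confirm that the Solicited bit is not set before sending.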

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-05 Thread Rohit Saini
Hi Ondrej, Finally found some lead on this.. We started tcpdump on my machine to understand the IPMI traffic. Attaching the capture for your reference. fd00:1061:37:9021:: is my floating IP and fd00:1061:37:9002:: is my ILO IP. When resource movement happens, we are initiating the "Neighbor

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-03 Thread Ondrej
On 4/3/19 6:10 PM, Rohit Saini wrote: Hi Ondrej, Please find my reply below: 1. *Stonith configuration:* [root@orana ~]# pcs config  Resource: fence-uc-orana (class=stonith type=fence_ilo4)   Attributes: delay=0 ipaddr=fd00:1061:37:9002:: lanplus=1 login=xyz passwd=xyz pcmk_host_list=orana

Re: [ClusterLabs] Issue with DB2 HADR cluster

2019-04-02 Thread Digimer
On 2019-04-02 1:32 p.m., Andrei Borzenkov wrote: > 02.04.2019 19:32, Dileep V Nair wrote: >> >> >> Hi, >> >> I have a two node DB2 Cluster with pacemaker and HADR. When I issue a >> reboot -f on the node where Primary Database is running, I expect the >> Standby database to be promoted as

Re: [ClusterLabs] Issue with DB2 HADR cluster

2019-04-02 Thread Andrei Borzenkov
02.04.2019 19:32, Dileep V Nair wrote: > > > Hi, > > I have a two node DB2 Cluster with pacemaker and HADR. When I issue a > reboot -f on the node where Primary Database is running, I expect the > Standby database to be promoted as Primary. But what is happening is > pacemaker waits for

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-01 Thread Rohit Saini
Looking for some help on this. Thanks, Rohit On Thu, Mar 28, 2019 at 11:24 AM Rohit Saini wrote: > Hi All, > I am trying fence_ilo4 with same ILO device having IPv4 and IPv6 address. > I see some discrepancy in both the behaviours: > > *1. When ILO has IPv4 address* > This is working fine and

[ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-03-28 Thread Rohit Saini
Hi All, I am trying fence_ilo4 with same ILO device having IPv4 and IPv6 address. I see some discrepancy in both the behaviours: *1. When ILO has IPv4 address* This is working fine and stonith resources are started immediately. *2. When ILO has IPv6 address* Starting of stonith resources is

Re: [ClusterLabs] Issue on Clusterlabs quickstart guide

2018-06-22 Thread Digimer
On 2018-06-21 03:57 AM, facebook wrote: > Dear All, > > I found a couple of issues in this page: > > https://clusterlabs.org/quickstart-redhat.html > > but I'm not sure if this mailing list is the right place to highlight > it. Sorry if it's not. > > Best Regards, > > Mario.  Here is

[ClusterLabs] Issue on Clusterlabs quickstart guide

2018-06-22 Thread facebook
Dear All, I found a couple of issues in this page: https://clusterlabs.org/quickstart-redhat.html but I'm not sure if this mailing list is the right place to highlight it. Sorry if it's not. Best Regards, Mario.

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Jan Pokorný
On 14/12/17 20:59 +0300, Andrei Borzenkov wrote: > 14.12.2017 19:25, Jan Pokorný wrote: >> On 14/12/17 10:49 -0500, Julien Semaan wrote: >>> Great success! >>> >>> Adding the following line to /usr/lib/systemd/system/pacemaker.service did >>> it: >>> After=dbus.service >> >> Note, this is not a

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Jan Pokorný
On 14/12/17 17:25 +0100, Jan Pokorný wrote: > Anyway, the change is seemingly straightforward, but a few things > should be answered/investigated first: > - After=dbus.service or rather After=dbus.socket (or both)? In theory, dbus.socket would be more flexible should anyone want to redirect it to a

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Andrei Borzenkov
14.12.2017 19:25, Jan Pokorný wrote: > On 14/12/17 10:49 -0500, Julien Semaan wrote: >> Great success! >> >> Adding the following line to /usr/lib/systemd/system/pacemaker.service did >> it: >> After=dbus.service > > Note, this is not a proper way for overriding the systemd unit files, > which is

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Jan Pokorný
On 14/12/17 17:25 +0100, Jan Pokorný wrote: > On 14/12/17 10:49 -0500, Julien Semaan wrote: >> Great success! >> >> Adding the following line to /usr/lib/systemd/system/pacemaker.service did >> it: >> After=dbus.service > > [...] > > Anyway, the change is seemingly straightforward, but a few

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Julien Semaan
Hi Jan, Good clarification on my ugly systemd unit fix, wrote it a bit too fast and was so happy to finally track that one down. Good to know the bug will get notified through this channel, I'd be happy to provide any details to the maintainers if necessary. Cheers! -- Julien Semaan

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Jan Pokorný
On 14/12/17 10:49 -0500, Julien Semaan wrote: > Great success! > > Adding the following line to /usr/lib/systemd/system/pacemaker.service did > it: > After=dbus.service Note, this is not a proper way for overriding the systemd unit files, which is rather along the lines: - make a copy to

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Julien Semaan
Hi Andrei, Great success! Adding the following line to /usr/lib/systemd/system/pacemaker.service did it: After=dbus.service Now, the question is, should the unit file shipped in the RPM be adjusted (currently using CentOS 7), if so, is this the best place to get the message going, or

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-13 Thread Andrei Borzenkov
Sent from iPhone > On 13 Dec 2017, at 22:53, Julien Semaan wrote: > > Hello, > > It's my first post on this mailing list so excuse any rookie mistake I may make > in this thread. > > We currently have clusters deployed using corosync/pacemaker that manage DRBD > +

[ClusterLabs] Issue with DRBD + a systemd resource

2017-12-13 Thread Julien Semaan
Hello, It's my first post on this mailing list so excuse any rookie mistake I may make in this thread. We currently have clusters deployed using corosync/pacemaker that manage DRBD + a couple of systemd services. My colleague Derek previously emailed the list about it but has left the

Re: [ClusterLabs] Issue in starting Pacemaker Virtual IP in RHEL 7

2017-11-09 Thread Jan Pokorný
On 06/11/17 10:43 +, Somanath Jeeva wrote: > I am using a two node pacemaker cluster with teaming enabled. The cluster has > > 1. Two team interfaces with different subnets. > > 2. The team1 has an NFS VIP plumbed to it. > > 3. The VirtualIP from pacemaker is configured to

[ClusterLabs] Issue in starting Pacemaker Virtual IP in RHEL 7

2017-11-06 Thread Somanath Jeeva
Hi I am using a two node pacemaker cluster with teaming enabled. The cluster has 1. Two team interfaces with different subnets. 2. The team1 has an NFS VIP plumbed to it. 3. The VirtualIP from pacemaker is configured to plumb to team0(Corosync ring number is 0) In this case

[ClusterLabs] Issue with attrd_updater hang

2017-01-09 Thread Vladislav Bogdanov
Hi! our customers were hit by a quite strange issue with resources populating attributes in attrd. The most obscure fact is that they see that issue only on a selected subset of nodes (two nodes in a 8-node cluster). Symptoms are sporadic timeouts of resources whose RAs call attrd_updater to

Re: [ClusterLabs] Issue in resource constraints and fencing - RHEL7 - AWS EC2

2016-05-20 Thread Ken Gaillot
On 05/20/2016 10:02 AM, Pratip Ghosh wrote: > Hi All, > > I am implementing 2 node RedHat (RHEL 7.2) HA cluster on Amazon EC2 > instance. For floating IP I am using a shell script provided by AWS so > that virtual IP float to another instance if any one server failed with > health check. In basic

[ClusterLabs] Issue in resource constraints and fencing - RHEL7 - AWS EC2

2016-05-20 Thread Pratip Ghosh
Hi All, I am implementing 2 node RedHat (RHEL 7.2) HA cluster on Amazon EC2 instance. For floating IP I am using a shell script provided by AWS so that virtual IP float to another instance if any one server failed with health check. In basic level cluster is working but I have 2 issues on

Re: [ClusterLabs] Issue with Stonith Resource parameters

2016-03-09 Thread vija ar
here is the config

Re: [ClusterLabs] Issue with Stonith Resource parameters

2016-03-08 Thread emmanuel segura
I think you should give the parameters to the stonith agent, anyway show your config. 2016-03-09 5:29 GMT+01:00 vija ar : > I have configured SLEHA cluster on cisco ucs boxes with ipmi configured, i > have tested IPMI using ipmitool, however ipmitool to function neatly i have >

[ClusterLabs] Issue with Stonith Resource parameters

2016-03-08 Thread vija ar
I have configured SLEHA cluster on cisco ucs boxes with ipmi configured, i have tested IPMI using ipmitool, however ipmitool to function neatly i have to pass parameter -y i.e. along with username and password, however to configure stonith there is no parameter in pacemaker to pass ? and due to