Re: [ClusterLabs] PostgreSQL server timelines offset after promote

2024-03-27 Thread Jehan-Guillaume de Rorthais via Users
Bonjour Thierry,

On Mon, 25 Mar 2024 10:55:06 +
FLORAC Thierry  wrote:

> I'm trying to create a PostgreSQL master/slave cluster using streaming
> replication and pgsqlms agent. Cluster is OK but my problem is this : the
> master node is sometimes restarted for system operations, and the slave is
> then promoted without any problem ; 

When you have to do some planned system operation, you **must** diligently
ask Pacemaker for permission. Pacemaker is the real owner of your resource. It
will react to any unexpected event, even if it's a planned one. You must
consider it a hidden colleague taking care of your resource.

There are various ways to deal with Pacemaker when you need to do some system
maintenance on your primary, depending on your constraints. Here are two
examples:

* ask Pacemaker to move the "promoted" role to another node
* then put the node in standby mode
* then do your admin tasks
* then unstandby your node: a standby should start on the original node
* optional: move the "promoted" role back to the original node
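
With pcs, that first procedure could look like this (a rough sketch, assuming a
promotable clone named "pgsqld-clone" and nodes "srv1"/"srv2"; the exact flags
depend on your pcs version, e.g. "--master" instead of "--promoted" on older
ones):

  pcs resource move pgsqld-clone srv2 --promoted   # move the promoted role
  pcs node standby srv1
  # ... system maintenance and reboot of srv1 ...
  pcs node unstandby srv1                          # a standby starts again on srv1
  pcs resource clear pgsqld-clone                  # drop the temporary move constraint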

Or:

* put the whole cluster in maintenance mode
* then do your admin tasks
* then check everything works as the cluster expects
* then exit the maintenance mode

The second one might be tricky if Pacemaker finds some unexpected status/event
when exiting maintenance mode.

You can find (old) examples of administrative tasks in:

* with pcs: https://clusterlabs.github.io/PAF/CentOS-7-admin-cookbook.html
* with crm: https://clusterlabs.github.io/PAF/Debian-8-admin-cookbook.html
* with "low level" commands:
  https://clusterlabs.github.io/PAF/administration.html

These doc updates are long overdue, sorry about that :(

Also, here is a hidden gist (that needs some updates as well):

https://github.com/ClusterLabs/PAF/tree/workshop/docs/workshop/fr


> after reboot, the old master is re-promoted, but I often get an error in
> slave logs :
> 
>   FATAL:  la plus grande timeline 1 du serveur principal est derrière la
>   timeline de restauration 2
> 
> which can be translated into English as:
> 
>   FATAL:  highest timeline 1 of the primary is behind recovery timeline 2


This is unexpected. I wonder how Pacemaker is being stopped. It is supposed to
stop its resources gracefully. The promotion scores should be updated to reflect
that the local resource is not a primary anymore, and PostgreSQL should be demoted
then stopped. It is supposed to start as a standby after a graceful shutdown.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] trigger something at ?

2024-02-01 Thread Jehan-Guillaume de Rorthais via Users
On Wed, 31 Jan 2024 18:23:40 +0100
lejeczek via Users  wrote:

> On 31/01/2024 17:13, Jehan-Guillaume de Rorthais wrote:
> > On Wed, 31 Jan 2024 16:37:21 +0100
> > lejeczek via Users  wrote:
> >  
> >>
> >> On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote:  
> >>> On Wed, 31 Jan 2024 16:02:12 +0100
> >>> lejeczek via Users  wrote:
> >>>  
> >>>> On 29/01/2024 17:22, Ken Gaillot wrote:  
> >>>>> On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote:  
> >>>>>> Hi guys.
> >>>>>>
> >>>>>> Is it possible to trigger some... action - I'm thinking specifically
> >>>>>> at shutdown/start.
> >>>>>> If not within the cluster then - if you do that - perhaps outside.
> >>>>>> I would like to create/remove constraints, when cluster starts &
> >>>>>> stops, respectively.
> >>>>>>
> >>>>>> many thanks, L.
> >>>>>>  
> >>>>> You could use node status alerts for that, but it's risky for alert
> >>>>> agents to change the configuration (since that may result in more
> >>>>> alerts and potentially some sort of infinite loop).
> >>>>>
> >>>>> Pacemaker has no concept of a full cluster start/stop, only node
> >>>>> start/stop. You could approximate that by checking whether the node
> >>>>> receiving the alert is the only active node.
> >>>>>
> >>>>> Another possibility would be to write a resource agent that does what
> >>>>> you want and order everything else after it. However it's even more
> >>>>> risky for a resource agent to modify the configuration.
> >>>>>
> >>>>> Finally you could write a systemd unit to do what you want and order it
> >>>>> after pacemaker.
> >>>>>
> >>>>> What's wrong with leaving the constraints permanently configured?  
> >>>> yes, that would be for a node start/stop
> >>>> I struggle with using constraints to move pgsql (PAF) master
> >>>> onto a given node - seems that co/locating paf's master
> >>>> results in troubles (replication brakes) at/after node
> >>>> shutdown/reboot (not always, but way too often)  
> >>> What? What's wrong with colocating PAF's masters exactly? How does it
> >>> break any replication? What are these constraints you are dealing with?
> >>>
> >>> Could you share your configuration?  
> >> Constraints beyond/above of what is required by PAF agent
> >> itself, say...
> >> you have multiple pgSQL cluster with PAF - thus multiple
> >> (separate, for each pgSQL cluster) masters and you want to
> >> spread/balance those across HA cluster
> >> (or in other words - avoid having more that 1 pgsql master
> >> per HA node)  
> > ok
> >  
> >> These below, I've tried, those move the master onto chosen
> >> node but.. then the issues I mentioned.  
> > You just mentioned it breaks the replication, but there's so little
> > information about your architecture and configuration, it's impossible to
> > imagine how this could break the replication.
> >
> > Could you add details about the issues ?
> >  
> >> -> $ pcs constraint location PGSQL-PAF-5438-clone prefers  
> >> ubusrv1=1002
> >> or  
> >> -> $ pcs constraint colocation set PGSQL-PAF-5435-clone  
> >> PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master
> >> require-all=false setoptions score=-1000  
> > I suppose "collocation" constraint is the way to go, not the "location"
> > one.  
> This should be easy to replicate, 3 x VMs, Ubuntu 22.04 in 
> my case

No, this is not easy to replicate. I have no idea how you set up your PostgreSQL
replication, nor do I have your full Pacemaker configuration.

Please provide either detailed setup steps and/or ansible and/or terraform and/or
vagrant, then a detailed scenario showing how it breaks. This is how you can
help and motivate devs to reproduce your issue and work on it.

I will not try to poke around for hours until I find an issue that might not
even be the same as yours.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] trigger something at ?

2024-01-31 Thread Jehan-Guillaume de Rorthais via Users
On Wed, 31 Jan 2024 16:37:21 +0100
lejeczek via Users  wrote:

> 
> 
> On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote:
> > On Wed, 31 Jan 2024 16:02:12 +0100
> > lejeczek via Users  wrote:
> >
> >>
> >> On 29/01/2024 17:22, Ken Gaillot wrote:
> >>> On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote:
> >>>> Hi guys.
> >>>>
> >>>> Is it possible to trigger some... action - I'm thinking specifically
> >>>> at shutdown/start.
> >>>> If not within the cluster then - if you do that - perhaps outside.
> >>>> I would like to create/remove constraints, when cluster starts &
> >>>> stops, respectively.
> >>>>
> >>>> many thanks, L.
> >>>>
> >>> You could use node status alerts for that, but it's risky for alert
> >>> agents to change the configuration (since that may result in more
> >>> alerts and potentially some sort of infinite loop).
> >>>
> >>> Pacemaker has no concept of a full cluster start/stop, only node
> >>> start/stop. You could approximate that by checking whether the node
> >>> receiving the alert is the only active node.
> >>>
> >>> Another possibility would be to write a resource agent that does what
> >>> you want and order everything else after it. However it's even more
> >>> risky for a resource agent to modify the configuration.
> >>>
> >>> Finally you could write a systemd unit to do what you want and order it
> >>> after pacemaker.
> >>>
> >>> What's wrong with leaving the constraints permanently configured?
> >> yes, that would be for a node start/stop
> >> I struggle with using constraints to move pgsql (PAF) master
> >> onto a given node - seems that co/locating paf's master
> >> results in troubles (replication brakes) at/after node
> >> shutdown/reboot (not always, but way too often)
> > What? What's wrong with colocating PAF's masters exactly? How does it break
> > any replication? What are these constraints you are dealing with?
> >
> > Could you share your configuration?
> Constraints beyond/above of what is required by PAF agent 
> itself, say...
> you have multiple pgSQL cluster with PAF - thus multiple 
> (separate, for each pgSQL cluster) masters and you want to 
> spread/balance those across HA cluster
> (or in other words - avoid having more that 1 pgsql master 
> per HA node)

ok

> These below, I've tried, those move the master onto chosen 
> node but.. then the issues I mentioned.

You just mentioned it breaks the replication, but there's so little information
about your architecture and configuration, it's impossible to imagine how this
could break the replication.

Could you add details about the issues ?

> -> $ pcs constraint location PGSQL-PAF-5438-clone prefers 
> ubusrv1=1002
> or
> -> $ pcs constraint colocation set PGSQL-PAF-5435-clone 
> PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master 
> require-all=false setoptions score=-1000

I suppose "collocation" constraint is the way to go, not the "location" one.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] trigger something at ?

2024-01-31 Thread Jehan-Guillaume de Rorthais via Users
On Wed, 31 Jan 2024 16:02:12 +0100
lejeczek via Users  wrote:

> 
> 
> On 29/01/2024 17:22, Ken Gaillot wrote:
> > On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote:
> >> Hi guys.
> >>
> >> Is it possible to trigger some... action - I'm thinking specifically
> >> at shutdown/start.
> >> If not within the cluster then - if you do that - perhaps outside.
> >> I would like to create/remove constraints, when cluster starts &
> >> stops, respectively.
> >>
> >> many thanks, L.
> >>
> > You could use node status alerts for that, but it's risky for alert
> > agents to change the configuration (since that may result in more
> > alerts and potentially some sort of infinite loop).
> >
> > Pacemaker has no concept of a full cluster start/stop, only node
> > start/stop. You could approximate that by checking whether the node
> > receiving the alert is the only active node.
> >
> > Another possibility would be to write a resource agent that does what
> > you want and order everything else after it. However it's even more
> > risky for a resource agent to modify the configuration.
> >
> > Finally you could write a systemd unit to do what you want and order it
> > after pacemaker.
> >
> > What's wrong with leaving the constraints permanently configured?
> yes, that would be for a node start/stop
> I struggle with using constraints to move pgsql (PAF) master 
> onto a given node - seems that co/locating paf's master 
> results in troubles (replication brakes) at/after node 
> shutdown/reboot (not always, but way too often)

What? What's wrong with colocating PAF's masters exactly? How does it brake any
replication? What's these constraints you are dealing with?

Could you share your configuration?
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Beginner lost with promotable "group" design

2024-01-31 Thread Jehan-Guillaume de Rorthais via Users
On Wed, 31 Jan 2024 15:41:28 +0100
Adam Cecile  wrote:
[...]

> Thanks a lot for your suggestion, it seems I have something that work 
> correctly now, final configuration is:

I would recommend configuring everything in an offline CIB, then pushing it to
production as a whole, e.g.:

  # get current CIB
  pcs cluster cib cluster1.xml

  # edit offline CIB
  pcs -f cluster1.xml resource create Internal-IPv4 ocf:heartbeat:IPaddr2 \
ip=10.0.0.254 nic=eth0 cidr_netmask=24 op monitor interval=30

  [...]
  [...]

  pcs -f cluster1.xml stonith create vmfence fence_vmware_rest \
pcmk_host_map="gw-1:gw-1;gw-2:gw-2;gw-3:gw-3" \
ip=10.1.2.3 ssl=1 username=corosync@vsphere.local \
password=p4ssw0rd ssl_insecure=1

  # push offline CIB to production
  pcs cluster cib-push scope=configuration cluster1.xml

> I have a quick one regarding fencing. I disconnected eth0 from gw-3 and 
> the VM has been restarted automatically, so I guess it's the fencing 
> agent that kicked in. However, I left the VM in such state (so it's seen 
> offline by other nodes) and I thought it would end up being powered off 
> for good. However, it seems fencing agent is keeping it powered on. Is 
> that expected ?

The default action of this fencing agent is "reboot". Check the VM uptime to
confirm it has been rebooted. If you want it to shut down the VM for good, add
"action=off".

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Planning for Pacemaker 3

2024-01-25 Thread Jehan-Guillaume de Rorthais via Users
On Wed, 24 Jan 2024 16:47:54 -0600
Ken Gaillot  wrote:
...
> > Erm. Well, as this is a major upgrade where we can affect people's
> > conf and
> > break old things & so on, I'll jump in this discussion with a
> > wishlist to
> > discuss :)
> >   
> 
> I made sure we're tracking all these (links below),

Thank you Ken, for creating these tasks. I subscribed to them, but it seems I
cannot comment on them (or maybe I just failed to find how to do it).

> but realistically we're going to have our hands full dropping all the
> deprecated stuff in the time we have.

Let me know how I can help on these subjects. Also, I'm still silently sitting on
the IRC channel if needed.

> Most of these can be done in any version.

Four out of seven can be done in any version. For the three remaining ones, here
are my humble opinion and the needs from the PAF agent's point of view:

1. «Support failure handling of notify actions»
   https://projects.clusterlabs.org/T759
2. «Change allowed range of scores and value of +/-INFINITY»
   https://projects.clusterlabs.org/T756
3. «Default to sending clone notifications when agent supports it»
   https://projects.clusterlabs.org/T758

The first is the most important, as it allows implementing an actual election
before the promotion, breaking the current transition if the promotion score no
longer reflects reality since the last monitor action. PAF's current code goes
through a lot of convolutions to get a decent election mechanism that prevents
the promotion of a lagging node.

The second one would help remove some needless complexity from some resource
agent code (at least in PAF).

The third one is purely for comfort and consistency between action setups.

Have a good day!

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Planning for Pacemaker 3

2024-01-23 Thread Jehan-Guillaume de Rorthais via Users
Hi there !

On Wed, 03 Jan 2024 11:06:27 -0600
Ken Gaillot  wrote:

> Hi all,
> 
> I'd like to release Pacemaker 3.0.0 around the middle of this year. 
> I'm gathering proposed changes here:
> 
>  https://projects.clusterlabs.org/w/projects/pacemaker/pacemaker_3.0_changes/
> 
> Please review for anything that might affect you, and reply here if you
> have any concerns.


Erm. Well, as this is a major upgrade where we can affect people's conf and
break old things & so on, I'll jump in this discussion with a wishlist to
discuss :)

1. "recover", "migration-to" and "migration-from" actions support ?

  See discussion:
  https://lists.clusterlabs.org/pipermail/developers/2020-February/002258.html

2.1. INT64 promotion scores?
2.2. discovering promotion score ahead of promotion?
2.3. make OCF_RESKEY_CRM_meta_notify_* or equivalent officially available in all
 actions 

  See discussion:
  https://lists.clusterlabs.org/pipermail/developers/2020-February/002255.html

3.1. deprecate "notify=true" clone option, make it true by default
3.2. react to notify action return code

  See discussion:
  https://lists.clusterlabs.org/pipermail/developers/2020-February/002256.html

Of course, I can volunteer to help on some topics.

Cheers!
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] how to colocate promoted resources ?

2023-12-08 Thread Jehan-Guillaume de Rorthais via Users
On Fri, 8 Dec 2023 17:11:58 +0100
lejeczek via Users  wrote:
...
> Apologies, perhaps I was quite vague.
> I was thinking - having a 3-node HA cluster and 3-node 
> single-master->slaves pgSQL, now..
> say, I want pgSQL masters to spread across HA cluster so I 
> theory - having each HA node identical hardware-wise - 
> masters' resources would nicely balance out across HA cluster.

Multi-primary doesn't exist with (core) PostgreSQL. Unless you want to create
3 different instances, each with their own set of databases and tables, not
replicating with each other?
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] how to colocate promoted resources ?

2023-12-08 Thread Jehan-Guillaume de Rorthais via Users
Hi,

On Wed, 6 Dec 2023 10:36:39 +0100
lejeczek via Users  wrote:

> How do your colocate your promoted resources with balancing 
> underlying resources as priority?

What do you mean?

> With a simple scenario, say
> 3 nodes and 3 pgSQL clusters
> what would be best possible way - I'm thinking most gentle 
> at the same time, if that makes sense.

I'm not sure it answers your question (as I don't understand it), but here is a
doc explaining how to create and move two IPs, each supposed to start on a
secondary, avoiding the primary node if possible, as long as secondary nodes
exist:

https://clusterlabs.github.io/PAF/CentOS-7-admin-cookbook.html#adding-ips-on-standbys-nodes
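
In short, it boils down to colocation constraints with the standby role, something
like this (a rough sketch; resource names, address and scores are only examples):

  pcs resource create ip-stby1 ocf:heartbeat:IPaddr2 ip=192.168.122.49 cidr_netmask=24
  # prefer a node hosting a standby...
  pcs constraint colocation add ip-stby1 with slave pgsql-ha 100
  # ...and keep it away from the promoted node when possible
  pcs constraint colocation add ip-stby1 with master pgsql-ha -50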

++
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF / pgSQL fails after OS/system shutdown - FIX

2023-11-13 Thread Jehan-Guillaume de Rorthais via Users
On Fri, 10 Nov 2023 20:34:40 +0100
lejeczek via Users  wrote:

> On 10/11/2023 18:16, Jehan-Guillaume de Rorthais wrote:
> > On Fri, 10 Nov 2023 17:17:41 +0100
> > lejeczek via Users  wrote:
> >
> > ...  
> >>> Of course you can use "pg_stat_tmp", just make sure the temp folder
> >>> exists:
> >>>
> >>> cat <<EOF > /etc/tmpfiles.d/postgresql-part.conf
> >>> # Directory for PostgreSQL temp stat files
> >>> d /var/run/postgresql/14-paf.pg_stat_tmp 0700 postgres postgres - -
> >>> EOF
> >>>
> >>> To take this file in consideration immediately without rebooting the
> >>> server, run the following command:
> >>>
> >>> systemd-tmpfiles --create /etc/tmpfiles.d/postgresql-part.conf  
> >> Then there must be something else at play here with Ubuntus,
> >> for none of the nodes has any extra/additional configs for
> >> those paths & I'm sure that those were not created manually.  
> > Indeed.
> >
> > This parameter is usually set by pg_createcluster command and the folder
> > created by both pg_createcluster and pg_ctlcluster commands when needed.
> >
> > This is explained in PAF tutorial there:
> >
> > https://clusterlabs.github.io/PAF/Quick_Start-Debian-10-pcs.html#postgresql-and-cluster-stack-installation
> >
> > These commands comes from the postgresql-common wrapper, used in all Debian
> > related distros, allowing to install, create and use multiple PostgreSQL
> > versions on the same server.
> >  
> >> Perhpaphs pgSQL created these on it's own outside of HA-cluster.  
> > No, the Debian packaging did.
> >
> > Just create the config file I pointed you in my previous answer,
> > systemd-tmpfiles will take care of it and you'll be fine.
> >  
> [...]
> Still that directive for the stats - I wonder if that got 
> introduced, injected somewhere in between because the PG 
> cluster - which I had for a while - did not experience this 
> "issue" from the start and not until "recently"

No, it's a really old behavior of the Debian wrapper.

I documented it for PAF on Debian 8 and 9 in 2017-2018, but it has been there
for 9.5 years:

  
https://salsa.debian.org/postgresql/postgresql-common/-/commit/e83bbefd0d7e87890eee8235476c403fcea50fa8

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [EXT] Re: PAF / pgSQL fails after OS/system shutdown - FIX

2023-11-13 Thread Jehan-Guillaume de Rorthais via Users
On Mon, 13 Nov 2023 11:39:45 +
"Windl, Ulrich"  wrote:

> But shouldn't the RA check for that (and act appropriately)?

Interesting. I'm open to discuss this. Below my thoughts so far.

Why should the RA check that? There are so many ways to set up the system and
PostgreSQL, where should the RA stop checking for all the possible ways to break it?

The RA already checks various (maybe too many) things related to the instance
itself.

I know various other PostgreSQL setups that would trigger errors in the cluster
if the dba doesn't check that everything is correct. I'm really reluctant to
add a fair amount of code to the RA to correctly parse and check the
complex PostgreSQL setup. This would add complexity and bugs. Or maybe I
could add a specific OCF_CHECK_LEVEL sysadmins can trigger by hand before
starting the cluster. But I wonder if it's worth the pain: how many people would
know about this and actually run it?

The problem here is that few users actually realize how the postgresql-common
wrapper works and what it actually does behind their back. I really appreciate
this wrapper, I do. But when you set up a Pacemaker cluster, you either have to
bend to it when setting up PAF (as documented), or avoid it completely.

PAF is all about drawing a clear line between the sysadmin job and the DBA one.
The DBA must build a cluster of instances ready to start/replicate with standard
binaries (not wrappers) before the sysadmin can set up the resource in the
cluster.

Thoughts?

> -Original Message-
> From: Users  On Behalf Of Jehan-Guillaume de
> Rorthais via Users Sent: Friday, November 10, 2023 1:13 PM
> To: lejeczek via Users 
> Cc: Jehan-Guillaume de Rorthais 
> Subject: [EXT] Re: [ClusterLabs] PAF / pgSQL fails after OS/system shutdown -
> FIX
> 
> On Fri, 10 Nov 2023 12:27:24 +0100
> lejeczek via Users  wrote:
> ...
> > >
> > to share my "fix" for it - perhaps it was introduced by 
> > OS/packages (Ubuntu 22) updates - ? - as oppose to resource 
> > agent itself.
> > 
> > As the logs point out - pg_stat_tmp - is missing and from 
> > what I see it's only the master, within a cluster, doing 
> > those stats.
> > That appeared, I use the word for I did not put it into 
> > configs, on all nodes.
> > fix = to not use _pg_stat_tmp_ directive/option at all.  
> 
> Of course you can use "pg_stat_tmp", just make sure the temp folder exists:
> 
>   cat < /etc/tmpfiles.d/postgresql-part.conf
>   # Directory for PostgreSQL temp stat files
>   d /var/run/postgresql/14-paf.pg_stat_tmp 0700 postgres postgres - -
>   EOF
> 
> To take this file in consideration immediately without rebooting the server,
> run the following command:
> 
>   systemd-tmpfiles --create /etc/tmpfiles.d/postgresql-part.conf
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF / pgSQL fails after OS/system shutdown - FIX

2023-11-10 Thread Jehan-Guillaume de Rorthais via Users
On Fri, 10 Nov 2023 17:17:41 +0100
lejeczek via Users  wrote:

...
> > Of course you can use "pg_stat_tmp", just make sure the temp folder exists:
> >
> >cat <<EOF > /etc/tmpfiles.d/postgresql-part.conf
> ># Directory for PostgreSQL temp stat files
> >d /var/run/postgresql/14-paf.pg_stat_tmp 0700 postgres postgres - -
> >EOF
> >
> > To take this file in consideration immediately without rebooting the server,
> > run the following command:
> >
> >systemd-tmpfiles --create /etc/tmpfiles.d/postgresql-part.conf  
> Then there must be something else at play here with Ubuntus, 
> for none of the nodes has any extra/additional configs for 
> those paths & I'm sure that those were not created manually.

Indeed.

This parameter is usually set by the pg_createcluster command, and the folder is
created by both the pg_createcluster and pg_ctlcluster commands when needed.

This is explained in the PAF tutorial here:

https://clusterlabs.github.io/PAF/Quick_Start-Debian-10-pcs.html#postgresql-and-cluster-stack-installation

These commands come from the postgresql-common wrapper, used in all Debian-related
distros, which allows installing, creating and using multiple PostgreSQL versions
on the same server.

> Perhpaphs pgSQL created these on it's own outside of HA-cluster.

No, the Debian packaging did.

Just create the config file I pointed you to in my previous answer,
systemd-tmpfiles will take care of it and you'll be fine.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF / pgSQL fails after OS/system shutdown - FIX

2023-11-10 Thread Jehan-Guillaume de Rorthais via Users
On Fri, 10 Nov 2023 12:27:24 +0100
lejeczek via Users  wrote:
...
> >  
> to share my "fix" for it - perhaps it was introduced by 
> OS/packages (Ubuntu 22) updates - ? - as oppose to resource 
> agent itself.
> 
> As the logs point out - pg_stat_tmp - is missing and from 
> what I see it's only the master, within a cluster, doing 
> those stats.
> That appeared, I use the word for I did not put it into 
> configs, on all nodes.
> fix = to not use _pg_stat_tmp_ directive/option at all.

Of course you can use "pg_stat_tmp", just make sure the temp folder exists:

  cat <<EOF > /etc/tmpfiles.d/postgresql-part.conf
  # Directory for PostgreSQL temp stat files
  d /var/run/postgresql/14-paf.pg_stat_tmp 0700 postgres postgres - -
  EOF

To take this file into consideration immediately without rebooting the server,
run the following command:

  systemd-tmpfiles --create /etc/tmpfiles.d/postgresql-part.conf
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF / PGSQLMS and wal_level ?

2023-09-13 Thread Jehan-Guillaume de Rorthais via Users
On Wed, 13 Sep 2023 17:32:01 +0200
lejeczek via Users  wrote:

> On 08/09/2023 17:29, Jehan-Guillaume de Rorthais wrote:
> > On Fri, 8 Sep 2023 16:52:53 +0200
> > lejeczek via Users  wrote:
> >  
> >> Hi guys.
> >>
> >> Before I start fiddling and brake things I wonder if
> >> somebody knows if:
> >> pgSQL can work with: |wal_level = archive for PAF ?
> >> Or more general question with pertains to ||wal_level - can
> >> _barman_ be used with pgSQL "under" PAF?  
> > PAF needs "wal_level = replica" (or "hot_standby" on very old versions) so
> > it can have hot standbys where it can connects and query there status.
> >
> > Wal level "replica" includes the archive level, so you can set up archiving.
> >
> > Of course you can use barman or any other tools to manage your PITR Backups,
> > even when Pacemaker/PAF is looking at your instances. This is even the very
> > first step you should focus on during your journey to HA.
> >
> > Regards,  
> and with _barman_ specifically - is one method preferred, 
> recommended over another: streaming VS rsync - for/with PAF?

PAF doesn't need PITR, nor does it have a preferred method for it. The two are
not related.

The PITR procedure and tooling you are setting up will help with disaster
recovery (DRP), not with service continuity (BCP). So feel free to choose the
ones that help you achieve **YOUR** RTO/RPO needs in case of disaster.

The only vaguely related subject between PAF and your PITR tooling is how fast
and easily you'll be able to set up a standby from backups if needed.

Just avoid replication slots if possible, as WALs could quickly fill your
filesystem if something goes wrong and slots must keep WALs around. If you must
use them, set max_slot_wal_keep_size.
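
For example (available since PostgreSQL 13; the value is just an example, size it
according to your WAL volume):

  # postgresql.conf
  max_slot_wal_keep_size = 10GB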

++
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF / PGSQLMS and wal_level ?

2023-09-08 Thread Jehan-Guillaume de Rorthais via Users
On Fri, 8 Sep 2023 16:52:53 +0200
lejeczek via Users  wrote:

> Hi guys.
> 
> Before I start fiddling and brake things I wonder if 
> somebody knows if:
> pgSQL can work with: |wal_level = archive for PAF ?
> Or more general question with pertains to ||wal_level - can 
> _barman_ be used with pgSQL "under" PAF?

PAF needs "wal_level = replica" (or "hot_standby" on very old versions) so it
can have hot standbys where it can connects and query there status.

Wal level "replica" includes the archive level, so you can set up archiving.

Of course you can use barman or any other tool to manage your PITR backups,
even when Pacemaker/PAF is looking after your instances. This is even the very
first step you should focus on during your journey to HA.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF / PGSQLMS on Ubuntu

2023-09-08 Thread Jehan-Guillaume de Rorthais via Users
On Fri, 8 Sep 2023 10:26:42 +0200
lejeczek via Users  wrote:

> On 07/09/2023 16:20, lejeczek via Users wrote:
> >
> >
> > On 07/09/2023 16:09, Andrei Borzenkov wrote:  
> >> On Thu, Sep 7, 2023 at 5:01 PM lejeczek via Users 
> >>  wrote:  
> >>> Hi guys.
> >>>
> >>> I'm trying to set ocf_heartbeat_pgsqlms agent but I get:
> >>> ...
> >>> Failed Resource Actions:
> >>>    * PGSQL-PAF-5433 stop on ubusrv3 returned 'invalid 
> >>> parameter' because 'Parameter "recovery_target_timeline" 
> >>> MUST be set to 'latest'. It is currently set to ''' at 
> >>> Thu Sep  7 13:58:06 2023 after 54ms
> >>>
> >>> I'm new to Ubuntu and I see that Ubuntu has a bit 
> >>> different approach to paths (in comparison to how Centos 
> >>> do it).
> >>> I see separation between config & data, eg.
> >>>
> >>> 14  paf 5433 down   postgres 
> >>> /var/lib/postgresql/14/paf 
> >>> /var/log/postgresql/postgresql-14-paf.log
> >>>
> >>> I create the resource like here:
> >>>  
> >>> -> $ pcs resource create PGSQL-PAF-5433   
> >>> ocf:heartbeat:pgsqlms pgport=5433 bindir=/usr/bin 
> >>> pgdata=/etc/postgresql/14/paf 
> >>> datadir=/var/lib/postgresql/14/paf meta 
> >>> failure-timeout=30s master-max=1 op start timeout=60s op 
> >>> stop timeout=60s op promote timeout=30s op demote 
> >>> timeout=120s op monitor interval=15s timeout=10s 
> >>> role="Promoted" op monitor interval=16s timeout=10s 
> >>> role="Unpromoted" op notify timeout=60s promotable 
> >>> notify=true failure-timeout=30s master-max=1 --disable
> >>>
> >>> Ubuntu 22.04.3 LTS
> >>> What am I missing can you tell?  
> >> Exactly what the message tells you. You need to set 
> >> recovery_target=latest.  
> > and having it in 'postgresql.conf' make it all work for you?
> > I've had it and got those errors - perhaps that has to be 
> > set some place else.
> >  
> In case anybody was in this situation - I was missing one 
> important bit: _bindir_
> Ubuntu's pgSQL binaries have different path - what 
> resource/agent returns as errors is utterly confusing.

Uh ? Good point, I'll definitely have to check that.

Thanks for the report!
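
For the record, on Debian/Ubuntu the version-specific binaries live under
/usr/lib/postgresql/<major>/bin, so the resource would need something like this
(untested sketch, using the resource name from your command):

  pcs resource update PGSQL-PAF-5433 bindir=/usr/lib/postgresql/14/bin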
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] OCF_HEARTBEAT_PGSQL - any good with current Postgres- ?

2023-05-05 Thread Jehan-Guillaume de Rorthais via Users
On Fri, 5 May 2023 10:08:17 +0200
lejeczek via Users  wrote:

> On 25/04/2023 14:16, Jehan-Guillaume de Rorthais wrote:
> > Hi,
> >
> > On Mon, 24 Apr 2023 12:32:45 +0200
> > lejeczek via Users  wrote:
> >  
> >> I've been looking up and fiddling with this RA but
> >> unsuccessfully so far, that I wonder - is it good for
> >> current versions of pgSQLs?  
> > As far as I know, the pgsql agent is still supported, last commit on it
> > happen in Jan 11th 2023. I don't know about its compatibility with latest
> > PostgreSQL versions.
> >
> > I've been testing it many years ago, I just remember it was quite hard to
> > setup, understand and manage from the maintenance point of view.
> >
> > Also, this agent is fine in a shared storage setup where it only
> > start/stop/monitor the instance, without paying attention to its role
> > (promoted or not).
> >  
> It's not only that it's hard - which is purely due to 
> piss-poor man page in my opinion - but it really sounds 
> "expired".

I really don't know. My feeling is that the manpage might be outdated (which
really doesn't help with this agent), but not the RA itself.

> Eg. man page speaks of 'recovery.conf' which - as I 
> understand it - newer/current versions of pgSQL do not! even 
> use... which makes one wonder.

This has been fixed in late 2019, but with no documentation associated :/
See:
https://github.com/ClusterLabs/resource-agents/commit/a43075be72683e1d4ddab700ec16d667164d359c

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] OCF_HEARTBEAT_PGSQL - any good with current Postgres- ?

2023-04-25 Thread Jehan-Guillaume de Rorthais via Users
Hi,

On Mon, 24 Apr 2023 12:32:45 +0200
lejeczek via Users  wrote:

> I've been looking up and fiddling with this RA but 
> unsuccessfully so far, that I wonder - is it good for 
> current versions of pgSQLs?

As far as I know, the pgsql agent is still supported; the last commit on it
happened on Jan 11th, 2023. I don't know about its compatibility with the latest
PostgreSQL versions.

I've been testing it many years ago, I just remember it was quite hard to
setup, understand and manage from the maintenance point of view.

Also, this agent is fine in a shared-storage setup where it only
starts/stops/monitors the instance, without paying attention to its role
(promoted or not).

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [PaceMaker] Help troubleshooting frequent disjoin issue

2023-03-21 Thread Jehan-Guillaume de Rorthais via Users
On Tue, 21 Mar 2023 11:47:23 +0100
Jérôme BECOT  wrote:

> On 21/03/2023 at 11:00, Jehan-Guillaume de Rorthais wrote:
> > Hi,
> >
> > On Tue, 21 Mar 2023 09:33:04 +0100
> > Jérôme BECOT  wrote:
> >  
> >> We have several clusters running for different zabbix components. Some
> >> of these clusters consist of 2 zabbix proxies,where nodes run Mysql,
> >> Zabbix-proxy server and a VIP, and a corosync-qdevice.  
> > I'm not sure to understand your topology. The corosync-device is not
> > supposed to be on a cluster node. It is supposed to be on a remote node and
> > provide some quorum features to one or more cluster without setting up the
> > whole pacemaker/corosync stack.  
> I was not clear, the qdevice is deployed on a remote node, as intended.

ok

> >> The MySQL servers are always up to replicate, and are configured in
> >> Master/Master (they both replicate from the other but only one is supposed
> >> to be updated by the proxy running on the master node).  
> > Why do you bother with Master/Master when a simple (I suppose, I'm not a
> > MySQL cluster guy) Primary-Secondary topology or even a shared storage
> > would be enough and would keep your logic (writes on one node only) safe
> > from incidents, failures, errors, etc?
> >
> > HA must be a simple as possible. Remove useless parts when you can.  
> A shared storage moves the complexity somewhere else.

Yes, on storage/SAN side.

> A classic Primary / secondary can be an option if PaceMaker manages to start
> the client on the slave node,

I suppose this can be done using a location constraint.

> but it would become Master/Master during the split brain.

No, and if you do have a real split brain, then you might have something wrong in
your setup. See below.


> >> One cluster is prompt to frequent sync errors, with duplicate entries
> >> errors in SQL. When I look at the logs, I can see "Mar 21 09:11:41
> >> zabbix-proxy-01 pacemaker-controld  [948] (pcmk_cpg_membership)
> >> info: Group crmd event 89: zabbix-proxy-02 (node 2 pid 967) left via
> >> cluster exit", and within the next second, a rejoin. The same messages
> >> are in the other node logs, suggesting a split brain, which should not
> >> happen, because there is a quorum device.  
> > Would it be possible your SQL sync errors and the left/join issues are
> > correlated and are both symptoms of another failure? Look at your log for
> > some explanation about why the node decided to leave the cluster.  
> 
> My guess is that maybe a high latency in network cause the disjoin, 
> hence starting Zabbix-proxy on both nodes causes the replication error. 
> It is configured to use the vip which is up locally because there is a 
> split brain.

If you have a split brain, that means your quorum setup is failing. 

No node could start/promote a resource without having the quorum. If a node is
isolated from the cluster and the quorum device, it should stop its resources,
not recover/promote them.

If both nodes lose connection with each other but are still connected to the
quorum device, the latter should be able to grant the quorum to one side only.

Lastly, quorum is a split-brain protection when "things are going fine".
Fencing is a split-brain protection for all other situations. Fencing is hard
and painful, but it saves you from many split-brain situations.

> This is why I'm requesting guidance to check/monitor these nodes to find 
> out if it is temporary network latency that is causing the disjoin.

A cluster is always very sensitive to network latency/failures. You need to
build on stronger foundations.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [PaceMaker] Help troubleshooting frequent disjoin issue

2023-03-21 Thread Jehan-Guillaume de Rorthais via Users
Hi,

On Tue, 21 Mar 2023 09:33:04 +0100
Jérôme BECOT  wrote:

> We have several clusters running for different zabbix components. Some 
> of these clusters consist of 2 zabbix proxies,where nodes run Mysql, 
> Zabbix-proxy server and a VIP, and a corosync-qdevice. 

I'm not sure I understand your topology. The corosync-qdevice is not supposed
to be on a cluster node. It is supposed to be on a remote node and provide some
quorum features to one or more clusters without setting up the whole
pacemaker/corosync stack.
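
For reference, such a qdevice is usually attached to an existing cluster with
something like this (a sketch, assuming a separate host "qnetd-srv" running
corosync-qnetd):

  pcs quorum device add model net host=qnetd-srv algorithm=ffsplit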

> The MySQL servers are always up to replicate, and are configured in
> Master/Master (they both replicate from the other but only one is supposed to
> be updated by the proxy running on the master node).

Why do you bother with Master/Master when a simple (I suppose, I'm not a MySQL
cluster guy) primary/secondary topology, or even shared storage, would be
enough and would keep your logic (writes on one node only) safe from incidents,
failures, errors, etc.?

HA must be as simple as possible. Remove useless parts when you can.

> One cluster is prompt to frequent sync errors, with duplicate entries 
> errors in SQL. When I look at the logs, I can see "Mar 21 09:11:41 
> zabbix-proxy-01 pacemaker-controld  [948] (pcmk_cpg_membership)     
> info: Group crmd event 89: zabbix-proxy-02 (node 2 pid 967) left via 
> cluster exit", and within the next second, a rejoin. The same messages 
> are in the other node logs, suggesting a split brain, which should not 
> happen, because there is a quorum device.

Would it be possible that your SQL sync errors and the leave/join issues are
correlated and are both symptoms of another failure? Look at your logs for some
explanation of why the node decided to leave the cluster.

> Can you help me to troubleshoot this ? I can provide any 
> log/configuration required in the process, so let me know.
> 
> I'd also like to ask if there is a bit of configuration that can be done 
> to postpone service start on the other node for two or three seconds as 
> a quick workaround ?

How would it be a workaround?

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [External] : Reload DNSMasq after IPAddr2 change ?

2023-02-10 Thread Jehan-Guillaume de Rorthais via Users
Hi,

What about using the Dummy resource agent (ocf_heartbeat_dummy(7)) and colocating
it with your IP address? This RA creates a local file on start and removes it
on stop. The game then is to watch for this path from a systemd path unit and
trigger the reload when the file appears. See systemd.path(5).
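
A rough, untested sketch of this idea, assuming a Dummy resource created with an
explicit state file and colocated with the IP:

  # pcs resource create dnsmasq-trigger ocf:heartbeat:Dummy \
  #     state=/run/resource-agents/dnsmasq-trigger.state

  # /etc/systemd/system/dnsmasq-reload.path
  [Unit]
  Description=Reload dnsmasq when the cluster trigger file appears
  [Path]
  PathExists=/run/resource-agents/dnsmasq-trigger.state
  [Install]
  WantedBy=multi-user.target

  # /etc/systemd/system/dnsmasq-reload.service (triggered by the .path unit)
  [Unit]
  Description=Reload dnsmasq
  [Service]
  Type=oneshot
  ExecStart=/bin/systemctl reload dnsmasq.service

Then "systemctl enable --now dnsmasq-reload.path" to arm it.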

There might be multiple other ways to trigger such a reload using various
strategies and hacks relying on systemd or even DBus events...

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Resource validation [was: multiple resources - pgsqlms - and IP(s)]

2023-01-09 Thread Jehan-Guillaume de Rorthais via Users
Hi,

I definitely have some work/improvements to do on the pgsqlms agent, but there
are still some details I'm interested in discussing below.

On Fri, 6 Jan 2023 16:36:19 -0800
Reid Wahl  wrote:

> On Fri, Jan 6, 2023 at 3:26 PM Jehan-Guillaume de Rorthais via Users
>  wrote:
>>
>> On Wed, 4 Jan 2023 11:15:06 +0100
>> Tomas Jelinek  wrote:
>>  
>>> On 04. 01. 23 at 8:29, Reid Wahl wrote:
>>>> On Tue, Jan 3, 2023 at 10:53 PM lejeczek via Users
>>>>  wrote:  
>>>>>
>>>>> On 03/01/2023 21:44, Ken Gaillot wrote:  
>>>>>> On Tue, 2023-01-03 at 18:18 +0100, lejeczek via Users wrote:  
[...]
>>>>>>> Not related - Is this an old bug?:
>>>>>>>  
>>>>>>> -> $ pcs resource create pgsqld-apps ocf:heartbeat:pgsqlms  
>>>>>>> bindir=/usr/bin pgdata=/apps/pgsql/data op start timeout=60s
>>>>>>> op stop timeout=60s op promote timeout=30s op demote
>>>>>>> timeout=120s op monitor interval=15s timeout=10s
>>>>>>> role="Master" op monitor interval=16s timeout=10s
>>>>>>> role="Slave" op notify timeout=60s meta promotable=true
>>>>>>> notify=true master-max=1 --disable
>>>>>>> Error: Validation result from agent (use --force to override):
>>>>>>>  ocf-exit-reason:You must set meta parameter notify=true
>>>>>>> for your master resource
>>>>>>> Error: Errors have occurred, therefore pcs is unable to continue  
>>>>>> pcs now runs an agent's validate-all action before creating a
>>>>>> resource. In this case it's detecting a real issue in your command.
>>>>>> The options you have after "meta" are clone options, not meta options
>>>>>> of the resource being cloned. If you just change "meta" to "clone" it
>>>>>> should work.  
>>>>> Nope. Exact same error message.
>>>>> If I remember correctly there was a bug specifically
>>>>> pertained to 'notify=true'  
>>>>
>>>> The only recent one I can remember was a core dump.
>>>> - Bug 2039675 - pacemaker coredump with ocf:heartbeat:mysql resource
>>>> (https://bugzilla.redhat.com/show_bug.cgi?id=2039675)
>>>>
>>>>  From a quick inspection of the pcs resource validation code
>>>> (lib/pacemaker/live.py:validate_resource_instance_attributes_via_pcmk()),
>>>> it doesn't look like it passes the meta attributes. It only passes the
>>>> instance attributes. (I could be mistaken.)
>>>>
>>>> The pgsqlms resource agent checks the notify meta attribute's value as
>>>> part of the validate-all action. If pcs doesn't pass the meta
>>>> attributes to crm_resource, then the check will fail.
>>>
>>> Pcs cannot pass meta attributes to crm_resource, because there is
>>> nowhere to pass them to.  
>>
>> But, they are passed as environment variable by Pacemaker, why pcs couldn't
>> set them as well when running the agent?  
> 
> pcs uses crm_resource to run the validate-all action. crm_resource
> doesn't provide a way to pass in meta attributes -- only instance
> attributes. Whether crm_resource should provide that is another
> question...

But crm_resource can set them as environment variables; they are inherited by the
resource agent when it is executed:

  # This fails
  # crm_resource --validate   \
 --class ocf --agent pgsqlms --provider heartbeat \
 --option pgdata=/var/lib/pgsql/15/data   \
 --option bindir=/usr/pgsql-15/bin
  Operation validate (ocf:heartbeat:pgsqlms) returned 5 (not installed: 
You must set meta parameter notify=true for your "master" resource)
  ocf-exit-reason:You must set meta parameter notify=true for your "master" 
resource
  crm_resource: Error performing operation: Not installed

  # This fails on a different mandatory setup
  # OCF_RESKEY_CRM_meta_notify=1  \
crm_resource --validate   \
 --class ocf --agent pgsqlms --provider heartbeat \
 --option pgdata=/var/lib/pgsql/15/data   \
 --option bindir=/usr/pgsql-15/bin
  Operation validate (ocf:heartbeat:pgsqlms) returned 5 (not installed:
You must set meta parameter master-max=1 for your "master" resource)
  ocf-exit-reason:You must set meta parameter master-max=1 for your "master"
resource

Re: [ClusterLabs] multiple resources - pgsqlms - and IP(s)

2023-01-06 Thread Jehan-Guillaume de Rorthais via Users
On Wed, 4 Jan 2023 11:15:06 +0100
Tomas Jelinek  wrote:

> On 04. 01. 23 at 8:29, Reid Wahl wrote:
> > On Tue, Jan 3, 2023 at 10:53 PM lejeczek via Users
> >  wrote:  
> >>
> >>
> >>
> >> On 03/01/2023 21:44, Ken Gaillot wrote:  
> >>> On Tue, 2023-01-03 at 18:18 +0100, lejeczek via Users wrote:  
> >>>> On 03/01/2023 17:03, Jehan-Guillaume de Rorthais wrote:  
> >>>>> Hi,
> >>>>>
> >>>>> On Tue, 3 Jan 2023 16:44:01 +0100
> >>>>> lejeczek via Users  wrote:
> >>>>>  
> >>>>>> To get/have Postgresql cluster with 'pgsqlms' resource, such
> >>>>>> cluster needs a 'master' IP - what do you guys do when/if
> >>>>>> you have multiple resources off this agent?
> >>>>>> I wonder if it is possible to keep just one IP and have all
> >>>>>> those resources go to it - probably 'scoring' would be very
> >>>>>> tricky then, or perhaps not?  
> >>>>> That would mean all promoted pgsql MUST be on the same node at any
> >>>>> time.
> >>>>> If one of your instance got some troubles and need to failover,
> >>>>> *ALL* of them
> >>>>> would failover.
> >>>>>
> >>>>> This imply not just a small failure time window for one instance,
> >>>>> but for all
> >>>>> of them, all the users.
> >>>>>  
> >>>>>> Or you do separate IP for each 'pgsqlms' resource - the
> >>>>>> easiest way out?  
> >>>>> That looks like a better option to me, yes.
> >>>>>
> >>>>> Regards,  
> >>>> Not related - Is this an old bug?:
> >>>>  
> >>>> -> $ pcs resource create pgsqld-apps ocf:heartbeat:pgsqlms  
> >>>> bindir=/usr/bin pgdata=/apps/pgsql/data op start timeout=60s
> >>>> op stop timeout=60s op promote timeout=30s op demote
> >>>> timeout=120s op monitor interval=15s timeout=10s
> >>>> role="Master" op monitor interval=16s timeout=10s
> >>>> role="Slave" op notify timeout=60s meta promotable=true
> >>>> notify=true master-max=1 --disable
> >>>> Error: Validation result from agent (use --force to override):
> >>>>  ocf-exit-reason:You must set meta parameter notify=true
> >>>> for your master resource
> >>>> Error: Errors have occurred, therefore pcs is unable to continue  
> >>> pcs now runs an agent's validate-all action before creating a resource.
> >>> In this case it's detecting a real issue in your command. The options
> >>> you have after "meta" are clone options, not meta options of the
> >>> resource being cloned. If you just change "meta" to "clone" it should
> >>> work.  
> >> Nope. Exact same error message.
> >> If I remember correctly there was a bug specifically
> >> pertained to 'notify=true'  
> > 
> > The only recent one I can remember was a core dump.
> > - Bug 2039675 - pacemaker coredump with ocf:heartbeat:mysql resource
> > (https://bugzilla.redhat.com/show_bug.cgi?id=2039675)
> > 
> >  From a quick inspection of the pcs resource validation code
> > (lib/pacemaker/live.py:validate_resource_instance_attributes_via_pcmk()),
> > it doesn't look like it passes the meta attributes. It only passes the
> > instance attributes. (I could be mistaken.)
> > 
> > The pgsqlms resource agent checks the notify meta attribute's value as
> > part of the validate-all action. If pcs doesn't pass the meta
> > attributes to crm_resource, then the check will fail.
> >   
> 
> Pcs cannot pass meta attributes to crm_resource, because there is 
> nowhere to pass them to.

But they are passed as environment variables by Pacemaker, so why couldn't pcs
set them as well when running the agent?

> As defined in OCF 1.1, only instance attributes 
> matter for validation, see 
> https://github.com/ClusterLabs/OCF-spec/blob/main/ra/1.1/resource-agent-api.md#check-levels

It doesn't state clearly that meta attributes must be ignored by the agent
during these actions.

And one could argue checking a meta attribute is a purely internal setup check,
at level 0.

> The agents are bugged - they depend on meta data being passed to 
> validation. This is already tracked and being worked on:
> 
> https://github.com/ClusterLabs/resource-agents/pull/1826

The pgsqlms resource agent checks the OCF_RESKEY_CRM_meta_notify environment
variable before raising this error.

The pgsqlms resource agent relies on the notify action to make some important
checks and take some actions. Without notifies, the resource would just behave
wrongly. This is an essential check.

However, I've been considering moving some of these checks to the probe action
only. Would that make sense? The notify check could move there, as there's no
need to check it on a regular basis.
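
A sketch of the idea (in shell, while pgsqlms itself is written in Perl): a probe
is a monitor call with a zero interval, so this kind of meta-attribute sanity
check could run only there:

  if [ "${OCF_RESKEY_CRM_meta_interval:-0}" -eq 0 ]; then
      if [ "${OCF_RESKEY_CRM_meta_notify:-false}" != "true" ]; then
          echo "ocf-exit-reason:You must set meta parameter notify=true for your clone" >&2
          exit 6  # OCF_ERR_CONFIGURED
      fi
  fi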

Thanks,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] multiple resources - pgsqlms - and IP(s)

2023-01-03 Thread Jehan-Guillaume de Rorthais via Users
Hi,

On Tue, 3 Jan 2023 16:44:01 +0100
lejeczek via Users  wrote:

> To get/have Postgresql cluster with 'pgsqlms' resource, such 
> cluster needs a 'master' IP - what do you guys do when/if 
> you have multiple resources off this agent?
> I wonder if it is possible to keep just one IP and have all 
> those resources go to it - probably 'scoring' would be very 
> tricky then, or perhaps not?

That would mean all promoted pgsql instances MUST be on the same node at any
time. If one of your instances gets into trouble and needs to fail over, *ALL*
of them would fail over.

This implies not just a small failure time window for one instance, but for all
of them, and all the users.

> Or you do separate IP for each 'pgsqlms' resource - the 
> easiest way out?

That looks like a better option to me, yes.
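
Each IP is then simply tied to its instance's promoted role, e.g. (a sketch,
assuming a promotable clone named "pgsqld-apps-clone"; the address is an example):

  pcs resource create vip-apps ocf:heartbeat:IPaddr2 ip=192.168.122.50 cidr_netmask=24
  pcs constraint colocation add vip-apps with master pgsqld-apps-clone INFINITY
  pcs constraint order promote pgsqld-apps-clone then start vip-apps symmetrical=false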

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-07 Thread Jehan-Guillaume de Rorthais via Users
On Mon, 7 Nov 2022 14:06:51 +
Robert Hayden  wrote:

> > -Original Message-
> > From: Users  On Behalf Of Valentin Vidic
> > via Users
> > Sent: Sunday, November 6, 2022 5:20 PM
> > To: users@clusterlabs.org
> > Cc: Valentin Vidić 
> > Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> > 
> > On Sun, Nov 06, 2022 at 09:08:19PM +, Robert Hayden wrote:  
> > > When SBD_PACEMAKER was set to "yes", the lack of network connectivity  
> > to the node  
> > > would be seen and acted upon by the remote nodes (evicts and takes
> > > over ownership of the resources).  But the impacted node would just
> > > sit logging IO errors.  Pacemaker would keep updating the /dev/watchdog
> > > device so SBD would not self evict.   Once I re-enabled the network, then
> > >  
> > the
> > 
> > Interesting, not sure if this is the expected behaviour based on:
> > 
> > https://urldefense.com/v3/__https://lists.clusterlabs.org/pipermail/users/2
> > 017-
> > August/022699.html__;!!ACWV5N9M2RV99hQ!IvnnhGI1HtTBGTKr4VFabWA
> > LeMfBWNhcS0FHsPFHwwQ3Riu5R3pOYLaQPNia-
> > GaB38wRJ7Eq4Q3GyT5C3s8y7w$
> > 
> > Does SBD log "Majority of devices lost - surviving on pacemaker" or
> > some other messages related to Pacemaker?  
> 
> Yes.
> 
> > 
> > Also what is the status of Pacemaker when the network is down? Does it
> > report no quorum or something else?
> >   
> 
> Pacemaker on the failing node shows quorum even though it has lost 
> communication to the Quorum Device and to the other node in the cluster.

This is the main issue. Maybe inspecting the corosync-cmapctl output could shed
some light on a setting we are missing?

> The non-failing node of the cluster can see the Quorum Device system and 
> thus correctly determines to fence the failing node and take over its 
> resources.

Normal.

> Only after I run firewall-cmd --panic-off, will the failing node start to log
> messages about loss of TOTEM and getting a new consensus with the 
> now visible members.
> 
> I think all of that explains the lack of self-fencing when the sbd setting of
> SBD_PACEMAKER=yes is used.

I'm not sure. If I understand correctly, SBD_PACEMAKER=yes only instructs sbd to
keep an eye on the pacemaker+corosync processes (as described upthread). It
doesn't explain why Pacemaker keeps holding the quorum, but I might be missing
something...
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-05 Thread Jehan-Guillaume de Rorthais via Users
On Sat, 5 Nov 2022 20:54:55 +
Robert Hayden  wrote:

> > -Original Message-
> > From: Jehan-Guillaume de Rorthais 
> > Sent: Saturday, November 5, 2022 3:45 PM
> > To: users@clusterlabs.org
> > Cc: Robert Hayden 
> > Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> > 
> > On Sat, 5 Nov 2022 20:53:09 +0100
> > Valentin Vidić via Users  wrote:
> >   
> > > On Sat, Nov 05, 2022 at 06:47:59PM +, Robert Hayden wrote:  
> > > > That was my impression as well...so I may have something wrong.  My
> > > > expectation was that SBD daemon should be writing to the  
> > /dev/watchdog  
> > > > within 20 seconds and the kernel watchdog would self fence.  
> > >
> > > I don't see anything unusual in the config except that pacemaker mode is
> > > also enabled. This means that the cluster is providing signal for sbd even
> > > when the storage device is down, for example:
> > >
> > > 883 ?SL 0:00 sbd: inquisitor
> > > 892 ?SL 0:00  \_ sbd: watcher: /dev/vdb1 - slot: 0 - uuid: ...
> > > 893 ?SL 0:00  \_ sbd: watcher: Pacemaker
> > > 894 ?SL 0:00  \_ sbd: watcher: Cluster
> > >
> > > You can strace different sbd processes to see what they are doing at any
> > > point.  
> > 
> > I suspect both watchers should detect the loss of network/communication
> > with
> > the other node.
> > 
> > BUT, when sbd is in Pacemaker mode, it doesn't reset the node if the
> > local **Pacemaker** is still quorate (via corosync). See the full chapter:
> > «If Pacemaker integration is activated, SBD will not self-fence if
> > **device** majority is lost [...]»
> > https://urldefense.com/v3/__https://documentation.suse.com/sle-ha/15-
> > SP4/html/SLE-HA-all/cha-ha-storage-
> > protect.html__;!!ACWV5N9M2RV99hQ!LXxpjg0QHdAP0tvr809WCErcpPH0lx
> > MKesDNqK-PU_Xpvb_KIGlj3uJcVLIbzQLViOi3EiSV3bkPUCHr$
> > 
> > Would it be possible that no node is shutting down because the cluster is in
> > two-node mode? Because of this mode, both would keep the quorum
> > expecting the
> > fencing to kill the other one... Except there's no active fencing here, only
> > "self-fencing".
> >   
> 
> I failed to mention I also have a Quorum Device also setup to add its vote to
> the quorum. So two_node is not enabled. 

oh, ok.

> I suspect Valentin was onto to something with pacemaker keeping the watchdog
> device updated as it thinks the cluster is ok.  Need to research and test
> that theory out.  I will try to carve some time out next week for that.

AFAIK, Pacemaker strictly relies on SBD to deal with the watchdog. It doesn't
feed it by itself.

In Pacemaker mode, SBD is watching the two most important parts of the cluster:
Pacemaker and Corosync:

* the "Pacemaker watcher" of SBD connects to the CIB and checks that it is
  still updated on a regular basis and that the local node is marked online.
* the "Cluster watchers" all connect with each other using a dedicated
  communication group in the corosync ring(s).

Either watcher can report a failure to SBD, which would then self-stop the node.

If the network is down, I suppose the cluster watcher should complain. But I
suspect Pacemaker somehow keeps reporting itself as quorate, thus forbidding SBD
from killing the whole node...
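One way to test that guess during the outage might be to check what the
surviving-but-isolated node itself reports, for example:

  crm_node -q            # prints 1 if the local partition believes it has quorum
  corosync-quorumtool -s # corosync's own view of quorum and membership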

> Appreciate all of the feedback.  I have been dealing with Cluster Suite for a
> decade+ but focused on the company's setup.  I still have lots to learn,
> which keeps me interested.

+1

Keep us informed!

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-05 Thread Jehan-Guillaume de Rorthais via Users
On Sat, 5 Nov 2022 20:53:09 +0100
Valentin Vidić via Users  wrote:

> On Sat, Nov 05, 2022 at 06:47:59PM +, Robert Hayden wrote:
> > That was my impression as well...so I may have something wrong.  My
> > expectation was that SBD daemon should be writing to the /dev/watchdog
> > within 20 seconds and the kernel watchdog would self fence.  
> 
> I don't see anything unusual in the config except that pacemaker mode is
> also enabled. This means that the cluster is providing signal for sbd even
> when the storage device is down, for example:
> 
> 883 ?SL 0:00 sbd: inquisitor
> 892 ?SL 0:00  \_ sbd: watcher: /dev/vdb1 - slot: 0 - uuid: ...
> 893 ?SL 0:00  \_ sbd: watcher: Pacemaker
> 894 ?SL 0:00  \_ sbd: watcher: Cluster
> 
> You can strace different sbd processes to see what they are doing at any
> point.

I suspect both watchers should detect the loss of network/communication with
the other node.

BUT, when sbd is in Pacemaker mode, it doesn't reset the node if the
local **Pacemaker** is still quorate (via corosync). See the full chapter:
«If Pacemaker integration is activated, SBD will not self-fence if **device**
majority is lost [...]»
https://documentation.suse.com/sle-ha/15-SP4/html/SLE-HA-all/cha-ha-storage-protect.html

Would it be possible that no node is shutting down because the cluster is in
two-node mode? Because of this mode, both would keep quorum, expecting the
fencing to kill the other one... Except there's no active fencing here, only
"self-fencing".

To verify this guess, check the corosync configuration for the "two_node"
parameter, and check whether both nodes still report as quorate during the
network outage using:

  corosync-quorumtool -s
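For the first check, something like this should do (assuming the default
configuration path):

  grep -B2 -A2 two_node /etc/corosync/corosync.conf
  corosync-cmapctl | grep two_node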

If this turns out to be a good guess: without **active** fencing, I suppose a
cluster cannot rely on two-node mode. I'm not sure what the best setup would be,
though.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Corosync over dedicated interface?

2022-10-03 Thread Jehan-Guillaume de Rorthais via Users
On Mon, 3 Oct 2022 14:45:49 +0200
Tomas Jelinek  wrote:

> Dne 28. 09. 22 v 18:22 Jehan-Guillaume de Rorthais via Users napsal(a):
> > Hi,
> > 
> > A small addendum below.
> > 
> > On Wed, 28 Sep 2022 11:42:53 -0400
> > "Kevin P. Fleming"  wrote:
> >   
> >> On Wed, Sep 28, 2022 at 11:37 AM Dave Withheld 
> >> wrote:  
> >>>
> >>> Is it possible to get corosync to use the private network and stop trying
> >>> to use the LAN for cluster communications? Or am I totally off-base and am
> >>> missing something in my drbd/pacemaker configuration?  
> >>
> >> Absolutely! When I setup my two-node cluster recently I did exactly
> >> that. If you are using 'pcs' to manage your cluster, ensure that you
> >> add the 'addr=' parameter during 'pcs host auth' so that Corosync and
> >> the layers above it will use that address for the host. Something
> >> like:
> >>
> >> $ pcs host auth cluster-node-1 addr=192.168.10.1 cluster-node-2
> >> addr=192.168.10.2  
> > 
> > You can even set multiple rings so corosync can rely on both:
> > 
> >$ pcs host auth\
> >  cluster-node-1 addr=192.168.10.1 addr=10.20.30.1 \
> >  cluster-node-2 addr=192.168.10.2 addr=10.20.30.2  
> 
> Hi,
> 
> Just a little correction.
> 
> The 'pcs host auth' command accepts only one addr= for each node. The 
> address will be then used for pcs communication. If you don't put any 
> addr= in the 'pcs cluster setup' command, it will be used for corosync 
> communication as well.
> 
> However, if you want to set corosync to use multiple rings, you do that 
> by specifying addr= in the 'pcs cluster setup' command like this:
> 
> pcs cluster setup cluster_name \
> cluster-node-1 addr=192.168.10.1 addr=10.20.30.1 \
> cluster-node-2 addr=192.168.10.2 addr=10.20.30.2
> 
> If you used addr= in the 'pcs host auth' command and you want the same 
> address to be used by corosync, you need to specify that address in the 
> 'pcs cluster setup' command. If you only specify the second address, 
> you'll end up with a one-ring cluster.

Oops! I mixed commands up, sorry!

Thank you Tomas.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Corosync over dedicated interface?

2022-09-28 Thread Jehan-Guillaume de Rorthais via Users
Hi,

A small addendum below.

On Wed, 28 Sep 2022 11:42:53 -0400
"Kevin P. Fleming"  wrote:

> On Wed, Sep 28, 2022 at 11:37 AM Dave Withheld 
> wrote:
> >
> > Is it possible to get corosync to use the private network and stop trying
> > to use the LAN for cluster communications? Or am I totally off-base and am
> > missing something in my drbd/pacemaker configuration?  
> 
> Absolutely! When I setup my two-node cluster recently I did exactly
> that. If you are using 'pcs' to manage your cluster, ensure that you
> add the 'addr=' parameter during 'pcs host auth' so that Corosync and
> the layers above it will use that address for the host. Something
> like:
> 
> $ pcs host auth cluster-node-1 addr=192.168.10.1 cluster-node-2
> addr=192.168.10.2

You can even set multiple rings so corosync can rely on both:

  $ pcs host auth \
cluster-node-1 addr=192.168.10.1 addr=10.20.30.1 \
cluster-node-2 addr=192.168.10.2 addr=10.20.30.2

Then, compare (but do not edit!) your "/etc/corosync/corosync.conf" on all
nodes.
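With two rings, the generated nodelist should end up looking roughly like this
(addresses reused from the example above):

  nodelist {
      node {
          ring0_addr: 192.168.10.1
          ring1_addr: 10.20.30.1
          name: cluster-node-1
          nodeid: 1
      }
      node {
          ring0_addr: 192.168.10.2
          ring1_addr: 10.20.30.2
          name: cluster-node-2
          nodeid: 2
      }
  }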

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] DRBD and SQL Server

2022-09-28 Thread Jehan-Guillaume de Rorthais via Users
On Wed, 28 Sep 2022 02:33:59 -0400
Madison Kelly  wrote:

> ...
> I'm happy to go into more detail, but I'll stop here until/unless you have
> more questions. Otherwise I'd write a book. :)

I would buy it ;)
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] (no subject)

2022-09-07 Thread Jehan-Guillaume de Rorthais via Users
Hey,

On Wed, 7 Sep 2022 19:12:53 +0900
권오성  wrote:

> Hello.
> I am a student who wants to implement a redundancy system with raspberry pi.
> Last time, I posted about how to proceed with installation on raspberry pi
> and received a lot of comments.
> Among them, I searched a lot after looking at the comments saying that
> fencing stonith should not be false.
> (ex -> sudo pcs property set stonith-enabled=false)
> However, I saw a lot of posts saying that there is no choice but to do
> false because there is no ipmi in raspberry pi, and I wonder how we can
> solve it in this situation.

Fencing is not just about IPMI:
* you can use external smart devices to shut down your nodes (eg. PDU, UPS, a
  self-made fencing device
  (https://www.alteeve.com/w/Building_a_Node_Assassin_v1.1.4))
* you can fence a node by disabling its access to the network from a
  manageable switch, without shutting down the node
* you can use 1, 2 or 3 shared storage devices + the hardware RPi watchdog
  using the SBD service
* you can use the internal hardware RPi watchdog using the SBD service, without
  a shared disk (rough sketch below)
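For the last option, a very rough and untested sketch (module name, file paths
and timeout values vary with the distribution and Pi model):

  # load the hardware watchdog driver
  modprobe bcm2835_wdt

  # /etc/sysconfig/sbd (or /etc/default/sbd):
  SBD_WATCHDOG_DEV=/dev/watchdog
  SBD_PACEMAKER=yes

  # then enable watchdog-only "self-fencing" in Pacemaker:
  pcs property set stonith-watchdog-timeout=10s
  pcs property set stonith-enabled=true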

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF with postgresql 13?

2022-06-24 Thread Jehan-Guillaume de Rorthais
Hi,

On Wed, 22 Jun 2022 16:36:03 +
CHAMPAGNE Julie  wrote:

> ...
> # pcs resource create pgsqld ocf:heartbeat:pgsqlms \
> pgdata="/etc/postgresql/11/main" \
> bindir="/usr/lib/postgresql/11/bin" \
> datadir="/var/lib/postgresql/11/main" \
> recovery_template="/etc/postgresql/recovery.conf.pcmk" \
> op start timeout=60s \
> op stop timeout=60s \
> op promote timeout=30s \
> op demote timeout=120s \
> op monitor interval=15s timeout=10s role="Master" \
> op monitor interval=16s timeout=10s role="Slave" \
> op notify timeout=60s 
> 
> # pcs resource clone pgsqld meta notify=true

I'm not sure of the compatibility with Debian 10, but this should be either one
of these commands:

  pcs resource promotable pgsqld pgsqld-clone meta notify=true
  pcs resource master pgsqld meta notify=true

If you want to use this two-step syntax, see the fine manual of pcs or eg.:

  
https://clusterlabs.github.io/PAF/Quick_Start-Debian-9-pcs.html#cluster-resources

Arguments "clone" and "master" have two different meanings. A "clone" ressource
is a simple one-state ressources that must be cloned and start on a various
number of node. Think httpd. The clones can either be stopped or started.
Pacemaker doesn't expect to be able promote a clone.

The "master"/"promotable" resources are an extension of clone ressources with an
additional state: stopped, started and promoted. In the last versions of
Pacemaker, we talk about clone -vs- promotable clone.
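As a hypothetical illustration of the difference (resource names are made up):

  # a plain clone: instances are only ever stopped or started
  pcs resource create dummy ocf:heartbeat:Dummy clone

  # a promotable clone: one instance can additionally be promoted
  pcs resource promotable pgsqld pgsqld-clone meta notify=true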

Under Debian 10, you can create your resource with the flag "promotable".
Compare the previous link with this one:

  
https://clusterlabs.github.io/PAF/Quick_Start-Debian-10-pcs.html#cluster-resources

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] constraining multiple cloned resources to the same node

2022-03-15 Thread Jehan-Guillaume de Rorthais
Hi,

On Tue, 15 Mar 2022 12:35:11 -0400
"john tillman"  wrote:

> I'm trying to guarantee that all my cloned drbd resources start on the
> same node and I can't figure out the syntax of the constraint to do it.
> 
> I could nominate one of the drbd resources as a "leader" and have all the
> others follow it.  But then if something happens to that leader the others
> are without constraint.

I never had to set up such a thing, but isn't it a "resource set" with a
colocation constraint? See the links below:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#s-resource-sets
https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/#using-promotable-clone-resources-in-colocation-sets
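With pcs, a colocation set might look something like this (resource names are
hypothetical):

  pcs constraint colocation set drbd0-clone drbd1-clone drbd2-clone \
      setoptions score=INFINITY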

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF with postgresql 13?

2022-03-08 Thread Jehan-Guillaume de Rorthais
On Tue, 8 Mar 2022 17:44:36 +
lejeczek via Users  wrote:

> On 08/03/2022 16:20, Jehan-Guillaume de Rorthais wrote:
> > Removing the node attributes with the resource might be legit from the
> > Pacemaker point of view, but I'm not sure how they can track the dependency
> > (ping Ken?).
> >
> > PAF has no way to know the ressource is being deleted and can not remove its
> > node attribute before hand.
> >
> > Maybe PCS can look for promotable score and remove them during the "resource
> > delete" command (ping Tomas)?  
> bit of catch-22, no?

No, why? What's the emergency? 

Can you lose some data? no.

Your service not able to be brought back quickly? no.

I hope I'm not missing something, but so far, it just looks like a
misunderstanding on how to correctly bring back the service.

Maybe there's something to improve, doc or code, but we first need to explain
what you see, what it means and what you should do. So, I'll try to explain
with some more gory details, sorry in advance.

> To those of us to whom it's first foray into all "this" it 
> might be - it was to me - I see "attributes" hanging on for 
> no reason, resource does not exist, then first thing I want 
> to do is a "cleanup" - natural - but this to result in 
> inability to re-create the resource(all remain slaves) at 
> later time with 'pcs' one-liner (no cib-push) is a: no no 
> should be not...

The "no no" is: «don't use debug-promote and other debug-* command, it doesn't
work with clones». 

At least it doesn't work with pgsqlms because these commands bypass the cluster
and doesn't set some _essential_ environment variables for clones that the
cluster usually set.

You CAN recreate your resource with your one-liner. But you just miss ONE
command to trigger a promotion. See the explanation below.

> ... because, at later time, why should not resource 
> re/creation with 'pcs resource create' bring back those 
> 'node attrs' ?

Because this is a different situation than when you first created the resource.
When you first created the resource, there was a primary and at least one
standby. On resource creation, pgsqlms detects the primary and sets its
promotion score to 1 (not even 1001, 1000, ..., just 1). Then, all the magic
happens from this very small seed.

Note that we are able to know whether an instance is a primary or a secondary,
even when it is stopped, by reading one of its internal files.

When you destroyed the pgsqld resource, Pacemaker stopped all the pgsql
instances, that means: "demote -> stop" for the primary, and "stop" for the
secondaries. Now, they are _all_ secondaries. Then, you cleaned the related node
attribute which was designating the previous primary.

Now, on the second creation, pgsqlms only finds secondaries and is not able
to "choose" one. Because there's no promotion score, the cluster is not able to
promote one either. Pacemaker is all about scores. No scores, no actions.
From there, the cluster requires some human wisdom to choose one to promote and
set its score with the command I gave in my previous message. Eg.:

  pcs node attribute srv2 master-pgsqld=1

Try it. Set the promotion score to 1 for one node. You'll see the cluster
react and pgsqlms recompute all the scores really quickly from there.
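To watch it happen, you could, for instance, keep an eye on the node attributes
and query the score back (node name reused from the example above):

  crm_mon -A1                                  # -A shows node attributes
  crm_attribute -N srv2 -n master-pgsqld -G    # query the promotion score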

I don't expect users to read the source, but this is explained in comments, as
a reminder for devs, see:

https://github.com/ClusterLabs/PAF/blob/master/script/pgsqlms#L1500

From the user perspective, this is documented here:

https://clusterlabs.github.io/PAF/configuration.html

  «
  Last but not least, during the very first startup of your cluster, the
  designated primary will be the only instance stopped gently as a primary. Take
  great care of this when you setup your cluster for the first time.
  »

I suppose I should document how to set the promotion score command somewhere...
maybe in the cookbook docs? What do you think?


Now, let's discuss. What to do when there's no promotion score around?

One possibility would be to check if all the instances are stopped at the same
level of data history (the LSN) and promote one of them randomly... But I bet it
would be annoying if one of them is not at the same level as the others, with
users wondering why one is promoted sometimes and none other times.
Moreover, this would add some more complexity to the code, and complexity is
bad for high availability.

Note that for various reasons, having a stopped cluster with instances at
different points in data history is not OK from the pgsqlms automaton's point of
view.

The other possibility I have actually been musing on for a long time is to
_remove_ this primary detection code. This would force the admin to pick one
primary explicitly by setting the score by hand on resource creation. This is
just a one-time command to add when you _create_ your resource and at

Re: [ClusterLabs] PAF with postgresql 13?

2022-03-08 Thread Jehan-Guillaume de Rorthais
Hi,

Sorry, your mail was really hard to read on my side, but I think I understood
and will try to answer below.

On Tue, 8 Mar 2022 11:45:30 +
lejeczek via Users  wrote:

> On 08/03/2022 10:21, Jehan-Guillaume de Rorthais wrote:
> >> op start timeout=60s \
> >> op stop timeout=60s \
> >> op promote timeout=30s \
> >> op demote timeout=120s \
> >> op monitor interval=15s timeout=10s role="Master" meta master-max=1 \
> >> op monitor interval=16s timeout=10s role="Slave" \
> >> op notify timeout=60s meta notify=true
> >
> > Because "op" appears, we are back in resource ("pgsqld") context,
> > anything after is interpreted as ressource and operation attributes,
> > even the "meta notify=true". That's why your pgsqld-clone doesn't
> > have the meta attribute "notify=true" set.
>
> Here is one-liner that should do - add, as per 'debug-' suggestion,
> 'master-max=1'

What debug- suggestion??

...
> then do:
> 
> -> $ pcs resource delete pgsqld  
> 
> '-clone' should get removed too, so now no 'pgsqld' 
> resource(s) but cluster - weirdly in my mind - leaves node 
> attributes on.

indeed.

> I see 'master-pgsqld' with each node and do not see why 
> 'node attributes' should be kept(certainly shown) for 
> non-existent resources(to which only resources those attrs 
> are instinct)
> So, you want to "clean" that for, perhaps for now you are 
> not going to have/use 'pgsqlms', you can do that with:
> 
> -> $ pcs node attribute node1 master-pgsqld="" # same for   
> remaining nodes

indeed.

> now .. ! repeat your one-liner which worked just a moment 
> ago and you should get exact same or similar errors(while 
> all nodes are stuck on 'slave'

You have no promotion because your PostgreSQL instances have been stopped
in standby mode. The cluster has no way and no score to promote one of them.

> -> $ pcs resource debug-promote pgsqld  
> crm_resource: Error performing operation: Error occurred
> Operation force-promote for pgsqld (ocf:heartbeat:pgsqlms) 
> returned 1 (error: Can not get current node LSN location)
> /tmp:5432 - accepting connections


NEVER use "debug-promote" or other "debug-*" command with pgsqlms, or any other
cloned ressources. AFAIK, these commands works fine for "stateless" ressource,
but do not (could not) create the required environnement for the
clone and multi-state ones.

So I repeat, NEVER use "debug-promote".

What you want to do is set the promotion score on the node where you want the
promotion to happen. Eg.:

  pcs node attribute srv1 master-pgsqld=1001

You can use "crm_attribute" or "crm_master" as well.

> ocf-exit-reason:Can not get current node LSN location

This one is probably because of "debug-promote".

> You have to 'cib-push' to "fix" this very problem.
> In my(admin's) opinion this is a 100% candidate for a bug - 
> whether in PCS or PAF - perhaps authors may wish to comment?

Removing the node attributes with the resource might be legit from the
Pacemaker point of view, but I'm not sure how they can track the dependency
(ping Ken?).

PAF has no way to know the resource is being deleted and cannot remove its
node attribute beforehand.

Maybe PCS could look for promotion scores and remove them during the "resource
delete" command (ping Tomas)?

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF with postgresql 13?

2022-03-08 Thread Jehan-Guillaume de Rorthais
On Tue, 8 Mar 2022 12:28:06 +
CHAMPAGNE Julie  wrote:

> I didn't know the arguments order was so important. I should have read the
> doc at first! Thank you so much!!

You're welcome.

> Oh and just a last question, is there a way to prefer node1 as master for the
> resource pgsqld? I probably need to define a score on it?

I would recommend keeping the cluster symmetric, to keep things simple and
reproducible. It's important. Moving the primary from one node to the other is
simple and fast enough to do it by hand with these two commands:

  pcs resource move --wait --master pgsqld-clone
  pcs resource clear pgsqld-clone

This admin cookbook is a bit outdated, but should be useful enough with no or
little changes:

  https://clusterlabs.github.io/PAF/CentOS-7-admin-cookbook.html

See chapter "Swapping primary and standby roles between nodes".

Make sure to read this page as well:

  https://clusterlabs.github.io/PAF/administration.html

Regards,

> -----Original Message-----
> From: Jehan-Guillaume de Rorthais
> Sent: Tuesday, 8 March 2022 11:21
> To: CHAMPAGNE Julie
> Cc: Cluster Labs - All topics related to open-source clustering welcomed
> Subject: Re: [ClusterLabs] PAF with postgresql 13?
> 
> Hi,
> 
> On Tue, 8 Mar 2022 08:00:22 +
> CHAMPAGNE Julie  wrote:
> 
> > I've created the ressource pgsqld as follow (don't think the cluster 
> > creation command is necessary):
> > 
> > pcs resource create pgsqld ocf:heartbeat:pgsqlms promotable \  
> 
> The problem is here. The argument order given to pcs is important. Every
> argument after "promotable" are interpreted as clone attributes. From the
> manpage:
> 
>   promotable [] []
> 
> The "promotable" should appear at the end, just before the "meta notify=true".
> 
> > PGDATA=/var/lib/postgresql/13/main \
> > bindir=/usr/lib/postgresql/13/bin  \
> > start_opts="-c 
> > config_file=/var/lib/postgresql/13/main/postgresql.conf" \  
> 
> Because of "promotable" appearing in front of these argument, they are read
> as clone attribute (for "pgsqld-clone"), where they should really be read as
> resource attribute (for "pgsqld").
> 
> (NB: I'm surprised by your "postgresql.conf" path under Debian, is it on
> purpose?)
> 
> > op start timeout=60s \
> > op stop timeout=60s \
> > op promote timeout=30s \
> > op demote timeout=120s \
> > op monitor interval=15s timeout=10s role="Master" meta master-max=1 \ 
> > op monitor interval=16s timeout=10s role="Slave" \ op notify 
> > timeout=60s meta notify=true  
> 
> Because "op" appears, we are back in resource ("pgsqld") context, anything
> after is interpreted as ressource and operation attributes, even the "meta
> notify=true". That's why your pgsqld-clone doesn't have the meta attribute
> "notify=true" set.
> 
> This argument ordering is kind of disturbing. You might prefer the alternate
> two commands form to create first "pgsqld", then "pgsqld-clone", each with
> their expected argument:
> 
>   pcs cluster cib cluster3.xml
> 
>   pcs -f cluster3.xml resource create pgsqld ocf:heartbeat:pgsqlms\
> pgdata="/etc/postgresql/13/main"  \
> bindir="/usr/lib/postgresql/13/bin"   \
> datadir="/var/lib/postgresql/13/main" \
> op start timeout=60s  \
> op stop timeout=60s   \
> op promote timeout=30s\
> op demote timeout=120s\
> op monitor interval=15s timeout=10s role="Master" \
> op monitor interval=16s timeout=10s role="Slave"  \
> op notify timeout=60s
> 
>   pcs -f cluster3.xml resource promotable pgsqld pgsqld-clone \
> meta notify=true
> 
>   [...other pcs commands for other constraints...]
> 
>   pcs cluster cib-push scope=configuration cluster3.xml
> 
> By the way, these commands has been adapted from the one-form command detailed
> here: https://clusterlabs.github.io/PAF/Quick_Start-Debian-10-pcs.html
> 
> Make sure to read/adapt from this page.
> 
> > BTW, I had to edit the file /usr/lib/ocf/resource.d/heartbeat/pgsqlms 
> > because the default values of bindir, pgdata didn't match the 
> > Debian/postgresql default settings:
> > 
> > # Default parameters values
> > my $system_user_default = "postgres";
> > my $bindir_default  = "/usr/lib/postgresql/13/bin"; 
> > my $pgdata_default  = "/var/lib/postgresql/13/main";  
> 
> You might now understand you should not have to edit these fields if the
> resource are correctly setup :)
> 
> Regards,

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF with postgresql 13?

2022-03-08 Thread Jehan-Guillaume de Rorthais
Hi,

On Tue, 8 Mar 2022 08:00:22 +
CHAMPAGNE Julie  wrote:

> I've created the ressource pgsqld as follow (don't think the cluster creation
> command is necessary):
> 
> pcs resource create pgsqld ocf:heartbeat:pgsqlms promotable \

The problem is here. The argument order given to pcs is important. Every
argument after "promotable" is interpreted as a clone attribute. From the
manpage:

  promotable [] []

The "promotable" should appear at the end, just before the "meta notify=true".

> PGDATA=/var/lib/postgresql/13/main \
> bindir=/usr/lib/postgresql/13/bin  \
> start_opts="-c config_file=/var/lib/postgresql/13/main/postgresql.conf" \

Because of "promotable" appearing in front of these argument, they are read as
clone attribute (for "pgsqld-clone"), where they should really be read as
resource attribute (for "pgsqld").

(NB: I'm surprised by your "postgresql.conf" path under Debian, is it on
purpose?)

> op start timeout=60s \
> op stop timeout=60s \
> op promote timeout=30s \
> op demote timeout=120s \
> op monitor interval=15s timeout=10s role="Master" meta master-max=1 \
> op monitor interval=16s timeout=10s role="Slave" \
> op notify timeout=60s meta notify=true

Because "op" appears, we are back in resource ("pgsqld") context, anything
after is interpreted as ressource and operation attributes, even the 
"meta notify=true". That's why your pgsqld-clone doesn't have the
meta attribute "notify=true" set.

This argument ordering is kind of disturbing. You might prefer the alternative
two-command form to create "pgsqld" first, then "pgsqld-clone", each with
their expected arguments:

  pcs cluster cib cluster3.xml

  pcs -f cluster3.xml resource create pgsqld ocf:heartbeat:pgsqlms\
pgdata="/etc/postgresql/13/main"  \
bindir="/usr/lib/postgresql/13/bin"   \
datadir="/var/lib/postgresql/13/main" \
op start timeout=60s  \
op stop timeout=60s   \
op promote timeout=30s\
op demote timeout=120s\
op monitor interval=15s timeout=10s role="Master" \
op monitor interval=16s timeout=10s role="Slave"  \
op notify timeout=60s

  pcs -f cluster3.xml resource promotable pgsqld pgsqld-clone \
meta notify=true

  [...other pcs commands for other constraints...]

  pcs cluster cib-push scope=configuration cluster3.xml
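Once pushed, it might be worth double-checking that the attributes landed where
expected, with something like (the command name depends on the pcs version):

  pcs resource config pgsqld-clone   # "pcs resource show" on older pcs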

By the way, these commands have been adapted from the one-command form detailed
here: https://clusterlabs.github.io/PAF/Quick_Start-Debian-10-pcs.html

Make sure to read/adapt from this page.

> BTW, I had to edit the file /usr/lib/ocf/resource.d/heartbeat/pgsqlms because
> the default values of bindir, pgdata didn't match the Debian/postgresql
> default settings:
> 
> # Default parameters values
> my $system_user_default = "postgres";
> my $bindir_default  = "/usr/lib/postgresql/13/bin"; 
> my $pgdata_default  = "/var/lib/postgresql/13/main";

You might now understand that you should not have to edit these fields if the
resource is correctly set up :)

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF with postgresql 13?

2022-03-07 Thread Jehan-Guillaume de Rorthais
On Mon, 7 Mar 2022 14:49:35 +
CHAMPAGNE Julie  wrote:

> The return gives nothing for the first command.
> Then:
> 
> name="test-debug" host="node1" value="testvalue" for node1.
> 
> After executing both commands on node2, it gives me the following return on
> both server:
> 
> name="test-debug" host="node2" value="testvalue"
> name="test-debug" host="node1" value="testvalue"

Well, everything sounds fine.

There was a bug in Pacemaker 2.1 a few weeks/months ago that was fixed
quickly. I was wondering if there was some trouble with 2.0 around the same
issue, but apparently not.

I don't understand why the "lsn_location" is empty here. It should have been
set during the pre-promote actions...

... wait, now that I think about notify actions, it looks like you set
"notify=true" on "pgsqld", not on "pgsqld-clone"? Quoting your previous email, I
can see multiple errors:


...
Resources:

Clone: pgsqld-clone

  Meta Attrs: PGDATA=/var/lib/postgresql/13/main
  bindir=/usr/lib/postgresql/13/bin promotable=true start_opts="-c
  config_file=/var/lib/postgresql/13/main/postgresql.conf"

  Resource: pgsqld (class=ocf provider=heartbeat type=pgsqlms)

   Meta Attrs: master-max=1 notify=true

...



1. the parameter is pgdata, not "PGDATA"
2. "pgdata", "bindir", "start_opts" are not meta attributes, they are simple
   attributes
3. "master-max=1" and "notify=true" are not supposed to be meta attributes of
   "pgsqld", but meta attributes of "pgsqld-clone".

Could you share the commands used to set up this cluster? There's something
wrong with them.

I am surprised pgsqlms did not error out with this setup...

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF with postgresql 13?

2022-03-07 Thread Jehan-Guillaume de Rorthais
On Mon, 7 Mar 2022 14:32:46 +
CHAMPAGNE Julie  wrote:

> root@node1 ~ > attrd_updater --private --lifetime reboot --name
> "lsn_location-pgsqld" --query Could not query value of lsn_location-pgsqld:
> attribute does not exist

Mh, sorry, could you please execute these two commands:

  attrd_updater --private --lifetime reboot --name "test-debug" --update "testvalue"

  attrd_updater --private --lifetime reboot --name "test-debug" --query

You might need to repeat the second command until it finds the "testvalue" value.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF with postgresql 13?

2022-03-07 Thread Jehan-Guillaume de Rorthais
Hi,

Caution, this is an English-speaking mailing list :)

My answer is below.

On Mon, 7 Mar 2022 12:31:07 +
CHAMPAGNE Julie  wrote:

> When I create a problem on node1,

What's the issue you are testing precisely?

>   * pgsqld_promote_0 on node2 'error' (1): call=24, status='complete',
> exitreason='Can not get current node LSN location',

It seems the agent had some trouble getting some private attributes from the
cluster. Could you give the exact:

* Debian version
* PAF version

Do you find any errors in the logs about setting/getting the lsn_location attribute?

What is the result of the following command:

  attrd_updater --private --lifetime reboot --name "lsn_location-pgsqld" --query


Thanks,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] backup of postgres database from cluster

2022-03-05 Thread Jehan-Guillaume de Rorthais
Hi,

On Wed, 2 Mar 2022 14:39:40 +0100
damiano giuliani  wrote:

> ...
> my question is: what happens in case of failover of the master on another
> node to the wal logs that i am archiving to build the incrementals?

The new primary is supposed to archive WALs to your backup server.

> would I still have a consistent backupset in case of a node failover?

Yes, if archiving is done correctly.
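For illustration, "done correctly" usually means every node pushes its WALs to
the same central repository, so the archive keeps growing whoever is primary. A
hypothetical example (tool and stanza name are placeholders, not a
recommendation):

  # postgresql.conf, identical on every node:
  archive_mode = on
  archive_command = 'pgbackrest --stanza=main archive-push %p'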
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF CentOS RPM

2022-02-22 Thread Jehan-Guillaume de Rorthais
On Tue, 22 Feb 2022 12:25:15 +0100
Oyvind Albrigtsen  wrote:

> ...
> >Ping Oyvind, maybe you have some input about this as the resource-agents
> >package maintainer?  
> I dont know how it got excluded on CentOS Stream only, but I've
> created a bz to fix it:
> https://bugzilla.redhat.com/show_bug.cgi?id=2056926

Thank you!
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF CentOS RPM

2022-02-22 Thread Jehan-Guillaume de Rorthais
Hello,

On Tue, 22 Feb 2022 09:27:16 +
lejeczek via Users  wrote:

> ...
> Perhaps as the author(s) you can chip in and/or help via comments to 
> rectify this:
> 
> ...
> 
>   Problem: package resource-agents-paf-4.9.0-7.el8.x86_64 requires 

PAF doesn't share the same release plan as the resource-agents project, but it
seems RH included it in their build process as part of the resource-agents
one, releasing it with the same version number:
https://bugzilla.redhat.com/show_bug.cgi?id=1872754

RH has been delivering PAF since the RHSA-2021:4139 security fixes and update,
in November 2021: https://access.redhat.com/errata/RHSA-2021:4139

I wasn't aware of this packaging and how it is built, nor of the repository
it is delivered to.

I am only aware of the RPM I am delivering on github, and the one provided by
the PGDG repository: https://yum.postgresql.org/packages/

> ...
> How I understand that is that CentOS guys/community have PAF in 
> 'resilientstorage' repo.

I have no idea why the PAF package is in this repo and not in the
HighAvailability one where I suppose it should be hosted. Compare:

* http://mirror.centos.org/centos/8-stream/HighAvailability/x86_64/os/Packages/
* http://mirror.centos.org/centos/8-stream/ResilientStorage/x86_64/os/Packages/

Ping Oyvind, maybe you have some input about this as the resource-agents
package maintainer?

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF with postgresql 13?

2022-02-21 Thread Jehan-Guillaume de Rorthais
On Mon, 21 Feb 2022 09:04:27 +
CHAMPAGNE Julie  wrote:
...
> The last release is 2 years old, is it still in development?

There's no activity because there's not much to do on it. PAF is mainly in
maintenance (bug fix) mode.

I have a few ideas here and there. They might land sooner or later, but nothing
really fancy. It just works.

The current effort is to revive the old workshop that was written a few years
ago and translate it to English.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Help with PostgreSQL Automatic Failover demotion

2022-02-18 Thread Jehan-Guillaume de Rorthais
Hello,

On Fri, 18 Feb 2022 21:44:58 +
"Larry G. Mills"  wrote:

> ... This happened again recently, and the running primary DB was demoted and
> then re-promoted to be the running primary. What I'm having trouble
> understanding is why the running Master/primary DB was demoted.  After the
> monitor operation timed out, the failcount for the ha-db resource was still
> less than the configured "migration-threshold", which is set to 5.

Because "migration-threshold" is the limit before the resource is moved away
from the node.

As long as your failcount is less than "migration-threshold" and the failure
is not fatal, the cluster will keep the resource on the same node and try to
"recover" it by running a full restart: demote -> stop -> start -> promote.

Since 2.0, the recover action can be demote -> promote. See the "on-fail"
property and the details about it below the table:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#operation-properties

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Re: what is the "best" way to completely shutdown a two-node cluster ?

2022-02-11 Thread Jehan-Guillaume de Rorthais
On Fri, 11 Feb 2022 08:07:33 +0100
"Ulrich Windl"  wrote:

> >> Jehan-Guillaume de Rorthais  schrieb am 10.02.2022 um  
> 16:40 in Nachricht <20220210164000.2e395a37@karst>:
> >  ...
> > I wonder if after the cluster shutdown complete, the target-role=Stopped 
> > could be removed/edited offline with eg. crmadmin? That would make
> > VirtualDomain startable on boot.  
> 
> It has also discussed before: "restart" is implemented by "first change role
> to stopped, then change role to started".
> If the performing node is fenced due to a stop failure, the resource is never
> started.
> So what's needed is a transient (i.e.: not saved in CIB) "restart" operation,
> that reverts to the previous state (started, most likely) if the the node
> performing it dies.
> Now transfer this to "stop-all-resources": The role attribute in the CIB would
> never be changed, but maybe just all the LRMs would stop their resources,
> eventually shutting down and when the node comes up again, the previous state
> will be re-established.

Nice, indeed.

> > ...
> > Last, if Bernd need to stop gracefully the VirtualDomain paying attention to
> > the I/O load, maybe he doesn't want them start automatically on boot for
> > the exact same reason anyway?  
> 
> But you can limit the number of concurrent invocations and migrations, right?
> Unfortunately I cannot remember the parameter.

I suppose you are talking about the cluster option "batch-limit"?
But then, this would have to be set to eg. "batch-limit=1" (?) only during the
shutdown/startup phase, to avoid slowing down cluster actions during normal
operation.
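A rough sketch of that idea (values are examples; 0 is the default, which lets
Pacemaker compute the limit dynamically):

  pcs property set batch-limit=1    # before the planned shutdown/startup
  # ... perform the maintenance ...
  pcs property set batch-limit=0    # back to the default afterwards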

++
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] what is the "best" way to completely shutdown a two-node cluster ?

2022-02-10 Thread Jehan-Guillaume de Rorthais
On Thu, 10 Feb 2022 22:15:07 +0800
Roger Zhou via Users  wrote:

> 
> On 2/9/22 17:46, Lentes, Bernd wrote:
> > 
> > 
> > - On Feb 7, 2022, at 4:13 PM, Jehan-Guillaume de Rorthais
> > j...@dalibo.com wrote:
> > 
> >> On Mon, 7 Feb 2022 14:24:44 +0100 (CET)
> >> "Lentes, Bernd"  wrote:
> >>
> >>> Hi,
> >>>
> >>> i'm currently changing a bit in my cluster because i realized that my
> >>> configuration for a power outtage didn't work as i expected. My idea is
> >>> currently:
> >>> - first stop about 20 VirtualDomains, which are my services. This will
> >>> surely takes some minutes. I'm thinking of stopping each with a time
> >>> difference of about 20 seconds for not getting to much IO load. and then
> >>> ...
> 
> This part is tricky. At one hand, it is good thinking to throttle IO load.
> 
> On the other hand, as Jehan and Ulrich mentioned, `crm resource stop ` 
> introduces "target‑role=Stopped" for each VirtualDomain, and have to do `crm 
> resource start ` to changed it back to "target‑role=Started" to start
> them after the power outage.

I wonder if, after the cluster shutdown completes, the target-role=Stopped
could be removed/edited offline with eg. crmadmin? That would make the
VirtualDomain resources startable on boot.

I suppose this would not be that simple, as it would require updating it on all
nodes, taking care of the CIB version, hash, etc... But maybe some tooling
could take care of this?

Last, if Bernd needs to stop the VirtualDomain resources gracefully while paying
attention to the I/O load, maybe he doesn't want them to start automatically on
boot for the exact same reason anyway?

++
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Re: what is the "best" way to completely shutdown a two‑node cluster ?

2022-02-10 Thread Jehan-Guillaume de Rorthais
On Thu, 10 Feb 2022 15:10:20 +0100
"Ulrich Windl"  wrote:
...
> > If you want to gracefully shutdown your cluster, then you can add one  
> manual
> > step to first gracefully stop your resources instead of betting the cluster
> > will do the good things.  
> 
> It's the old discussion: Old HP ServiceGuard had commands to halt a node and
> the halt the cluster; pacemaker never had a command to halt the cluster.
> IMHO that would be more important [...]

Agree.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] what is the "best" way to completely shutdown a two-node cluster ?

2022-02-10 Thread Jehan-Guillaume de Rorthais
On Wed, 9 Feb 2022 17:42:35 + (UTC)
Strahil Nikolov via Users  wrote:

> If you gracefully shutdown a node - pacemaker will migrate all resources away
>  so you need to shut them down simultaneously and all resources should be
> stopped by the cluster.
> 
> Shutting down the nodes would be my choice.

If you want to gracefully shut down your cluster, then you can add one manual
step to first gracefully stop your resources, instead of betting the cluster
will do the right thing.

As far as I remember, there's no way the DC/CRM can orchestrate the whole
cluster shutdown gracefully in the same transition. So I prefer to stand on the
safe side and add one step to my procedure.

I add this step even when using commands like `pcs cluster stop --all`, which
tries to shut down all the nodes "kind of" at the same time everywhere. At
least, I know where my resources were stopped and how they will start. It might
be important when you deal with eg. permanent promotion scores.
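One possible sequence, using the "stop-all-resources" property mentioned
elsewhere in this thread (a sketch, adapt to your tooling):

  pcs property set stop-all-resources=true    # stop resources cluster-wide
  crm_mon -1                                  # check everything is Stopped before going on
  pcs cluster stop --all                      # then stop the cluster stack everywhere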

++
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] what is the "best" way to completely shutdown a two-node cluster ?

2022-02-09 Thread Jehan-Guillaume de Rorthais
On Wed, 9 Feb 2022 10:46:30 +0100 (CET)
"Lentes, Bernd"  wrote:

> - On Feb 7, 2022, at 4:13 PM, Jehan-Guillaume de Rorthais j...@dalibo.com
> wrote:
> 
> > On Mon, 7 Feb 2022 14:24:44 +0100 (CET)
> > "Lentes, Bernd"  wrote:
> >   
> >> Hi,
> >> 
> >> i'm currently changing a bit in my cluster because i realized that my
> >> configuration for a power outtage didn't work as i expected. My idea is
> >> currently:
> >> - first stop about 20 VirtualDomains, which are my services. This will
> >> surely takes some minutes. I'm thinking of stopping each with a time
> >> difference of about 20 seconds for not getting to much IO load. and then
> >> ...
> >> - how to stop the other resources ?  
> > 
> > I would set cluster option "stop-all-resources" so all remaining resources
> > are stopped gracefully by the cluster.
> > 
> > Then you can stop both nodes using eg. "crm cluster stop".
> > 
> > On restart, after both nodes are up and joined to the cluster, you can set
> > "stop-all-resources=false", then start your VirtualDomains.  
> 
> Aren't  the VirtualDomains already started by "stop-all-resources=false" ?

I'm not sure how "crm resource stop " actually stop a resource. I thought
it would set "target-role=Stopped", but I might be wrong.

If "crm resource stop" actually use "target-role=Stopped", I believe the
resources would not start automatically after setting
"stop-all-resources=false".

++
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] what is the "best" way to completely shutdown a two-node cluster ?

2022-02-07 Thread Jehan-Guillaume de Rorthais
On Mon, 7 Feb 2022 14:24:44 +0100 (CET)
"Lentes, Bernd"  wrote:

> Hi,
> 
> i'm currently changing a bit in my cluster because i realized the my
> configuration for a power outtage didn't work as i expected. My idea is
> currently:
> - first stop about 20 VirtualDomains, which are my services. This will surely
> takes some minutes. I'm thinking of stopping each with a time difference of
> about 20 seconds for not getting to much IO load. and then ...
> - how to stop the other resources ?

I would set cluster option "stop-all-resources" so all remaining resources are
stopped gracefully by the cluster.

Then you can stop both nodes using eg. "crm cluster stop".

On restart, after both nodes are up and joined to the cluster, you can set
"stop-all-resources=false", then start your VirtualDomains.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Removing a resource without stopping it

2022-01-31 Thread Jehan-Guillaume de Rorthais
On Mon, 31 Jan 2022 08:49:44 +0100
Klaus Wenninger  wrote:
...
> Depending on the environment it might make sense to think about
> having the manual migration-step controlled by the cluster(s) using
> booth. Just thinking - not a specialist on that topic ...

Could you elaborate a bit on this?

Boothd allows starting/stopping a resource in the cluster currently owning the
associated ticket. In this regard, it could help stop the resource on one
side and start it on the other one.

However, as far as I know, there's no action like migrate-to/migrate-from that
could be executed across multiple clusters to deal with the migration steps
between both clusters... or is there?

++
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Removing a resource without stopping it

2022-01-31 Thread Jehan-Guillaume de Rorthais
Hi,

On Sat, 29 Jan 2022 16:51:47 -0500
Digimer  wrote:

> ...
> Though going back to the original question, deleting the server from
> pacemaker while the VM is left running, is still something I am quite curious
> about.

As the real resource has moved away, meaning it cannot be stopped locally, and
it will be deleted shortly after, what about changing the on-fail property of
the "stop" resource action to "ignore"?

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#operation-properties

cf.: « ignore: Pretend the resource did not fail »
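With pcs, that could look something like this (resource name and timeout are
hypothetical; keep the other operation options you already have):

  pcs resource update srv_vm1 op stop timeout=90s on-fail=ignore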

++
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF recover master after crash

2022-01-21 Thread Jehan-Guillaume de Rorthais
On Fri, 21 Jan 2022 18:17:04 +0100,
damiano giuliani  wrote:

> Ehy,
> 
> Take in account when a master node crash, you should re-allign the old
> master into the slave using pg_basebackup/pg_rewind and then rejoin the
> node into the cluster as a slave. This is the only way to avoid data
> corruption and be sure the new slave is correcly synchronised with the new
> master.

+1, thanks Damiano :)

Plus, see the documentation:
https://clusterlabs.github.io/PAF/administration.html#failover

And more generally: https://clusterlabs.github.io/PAF/documentation.html
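As a reminder of what that usually involves, a very rough sketch (paths, host
and resource names are hypothetical; see the documentation above for the real
procedure):

  # on the crashed old primary, once it is confirmed stopped:
  pg_rewind -D /var/lib/pgsql/13/data \
      --source-server='host=new-primary user=postgres'
  # then clear the failure so Pacemaker starts it again as a standby:
  pcs resource cleanup pgsqld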

> Diskless sbd need at least 3 nodes so i expect you are using classic sbd
> disk shared

+1

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] How to globally enable trace log level in pacemaker?

2021-10-31 Thread Jehan-Guillaume de Rorthais
Hi,

Under EL and Debian, there's a PCMK_debug variable (iirc) in 
"/etc/sysconfig/pacemaker" or "/etc/default/pacemaker".

Comments in there explain how to set debug mode for part or all of the 
pacemaker processes.

This might be the environment variable you are looking for?
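For example, something along these lines (the comments in the shipped file
describe the accepted values):

  # /etc/sysconfig/pacemaker (or /etc/default/pacemaker)
  PCMK_debug=yes
  # or only for some daemons, e.g.:
  # PCMK_debug=pacemaker-execd,pacemaker-controld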

Regards,

On 31 October 2021 09:20:00 GMT+01:00, Andrei Borzenkov  wrote:
>I think it worked in the past by passing a lot of -VVV when starting
>pacemaker. It does not seem to work now. I can call /usr/sbin/pacemakerd
>-..., but it does pass options further to children it
>starts. So every other daemon is started without any option and with
>default log level.
>
>This pacemaker 2.1.0 from openSUSE Tumbleweed.
>
>P.S. environment variable to directly set log level would certainly be
>helpful.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: Re: Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-10-12 Thread Jehan-Guillaume de Rorthais
On Tue, 12 Oct 2021 09:46:04 +0200
"Ulrich Windl"  wrote:

> >>> Jehan-Guillaume de Rorthais  schrieb am 12.10.2021 um
> >>> 09:35 in  
> Nachricht <20211012093554.4bb761a2@firost>:
> > On Tue, 12 Oct 2021 08:42:49 +0200
> > "Ulrich Windl"  wrote:
> >   
> ...
> >> "watch cat /proc/meminfo" could be your friend.  
> > 
> > Or even better, make sure you have sysstat or pcp tools family installed and
> > harvesting system metrics. You'll have the full historic of the dirty pages
> > variations during the day/week/month.  
> 
> Actually I think the 10 minute granularity of sysstat (sar) is to coarse to
> learn what's going on, specifically if your node is fenced before the latest
> record is written.

Indeed. You can still set it down to 1min in the crontab if needed. But the
point is to gather a better understanding of how the dirty pages (and many other
useful metrics) evolve over a long time frame.
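On EL7, for instance, that is a one-line change in the sysstat cron job (a
sketch; the exact path can differ between distributions):

  # /etc/cron.d/sysstat: collect every minute instead of every 10 minutes
  * * * * * root /usr/lib64/sa/sa1 1 1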

You will always lose a small part of the information after a fencing, no matter
if your period is 10min, 1min or even 1s.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: Re: Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-10-12 Thread Jehan-Guillaume de Rorthais
On Tue, 12 Oct 2021 08:42:49 +0200
"Ulrich Windl"  wrote:

> ...
> >> sysctl ‑a | grep dirty
> >> vm.dirty_background_bytes = 0
> >> vm.dirty_background_ratio = 10  
> > 
> > Considering your 256GB of physical memory, this means you can dirty up to 
> > 25GB
> > pages in cache before the kernel start to write them on storage.
> > 
> > You might want to trigger these background, lighter syncs much before 
> > hitting
> > this limit.
> >   
> >> vm.dirty_bytes = 0
> >> vm.dirty_expire_centisecs = 3000
> >> vm.dirty_ratio = 20  
> > 
> > This is 20% of your 256GB physical memory. After this limit, writes have to
> > go to disks, directly. Considering the time to write to SSD compared to
> > memory and the amount of data to sync in the background as well (52GB),
> > this could be very painful.  
> 
> Wowever (unless doing really large commits) databases should flush buffers
> rather frequently, so I doubt database operations would fill the dirty buffer
> rate.

It depends on your database setup, your concurrency, your active dataset, your
query profile, batches, and so on.

> "watch cat /proc/meminfo" could be your friend.

Or even better, make sure you have the sysstat or pcp tool families installed
and harvesting system metrics. You'll have the full history of the dirty page
variations during the day/week/month.
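For example, with sysstat, the kbdirty column of the memory report ("sar -r")
shows the dirty pages over time:

  sar -r                        # today's samples
  sar -r -f /var/log/sa/sa15    # samples from a past daily file (here, day 15)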
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-10-11 Thread Jehan-Guillaume de Rorthais
Hi,

I kept your full answer in the history to keep the list informed.

My answer is down below.

On Mon, 11 Oct 2021 11:33:12 +0200
damiano giuliani  wrote:

> ehy guys sorry for being late, was busy during the WE
> 
> here i im:
> 
> 
> > Did you see the swap activity (in/out, not just swap occupation) happen in
> > the
> >
> > same time the member was lost on corosync side?
> > Did you check corosync or some of its libs were indeed in swap?
> >
> >
> no and i dont know how do it, i just noticed the swap occupation which
> suggest me (and my collegue) to find out if it could cause some trouble.
> 
> > First, corosync now sit on a lot of memory because of knet. Did you try to
> > switch back to udpu which is using way less memory?
> 
> 
> No i havent move to udpd, cast stop processes at all.
> 
>   "Could not lock memory of service to avoid page faults"
> 
> 
> grep -rn 'Could not lock memory of service to avoid page faults' /var/log/*
> returns noting

This message should appear on corosync startup. Make sure the logs haven't been
rotated away in the meantime...

> > On my side, mlocks is unlimited on ulimit settings. Check the values
> > in /proc/$(coro PID)/limits (be careful with the ulimit command, check the
> > proc itself).
> 
> 
> cat /proc/101350/limits
> Limit Soft Limit   Hard Limit   Units
> Max cpu time  unlimitedunlimitedseconds
> Max file size unlimitedunlimitedbytes
> Max data size unlimitedunlimitedbytes
> Max stack size8388608  unlimitedbytes
> Max core file size0unlimitedbytes
> Max resident set  unlimitedunlimitedbytes
> Max processes 770868   770868
> processes
> Max open files1024 4096 files
> Max locked memory unlimitedunlimitedbytes
> Max address space unlimitedunlimitedbytes
> Max file locksunlimitedunlimitedlocks
> Max pending signals   770868   770868   signals
> Max msgqueue size 819200   819200   bytes
> Max nice priority 00
> Max realtime priority 00
> Max realtime timeout  unlimitedunlimitedus
> 
> Ah... That's the first thing I change.
> > In SLES, that is defaulted to 10s and so far I have never seen an
> > environment that is stable enough for the default 1s timeout.
> 
> 
> old versions have 10s default
> you are not going to fix the problem lthis way, 1s timeout for a bonded
> network and overkill hardware is enourmous time.
> 
> hostnamectl | grep Kernel
> Kernel: Linux 3.10.0-1160.6.1.el7.x86_64
> [root@ltaoperdbs03 ~]# cat /etc/os-release
> NAME="CentOS Linux"
> VERSION="7 (Core)"
> 
> > Indeed. But it's an arbitrage between swapping process mem or freeing
> > mem by removing data from cache. For database servers, it is advised to
> > use a
> > lower value for swappiness anyway, around 5-10, as a swapped process means
> > longer query, longer data in caches, piling sessions, etc.
> 
> 
> totally agree, for db server swappines has to be 5-10.
> 
> kernel?
> > What are your settings for vm.dirty_* ?
> 
> 
> 
> hostnamectl | grep Kernel
> Kernel: Linux 3.10.0-1160.6.1.el7.x86_64
> [root@ltaoperdbs03 ~]# cat /etc/os-release
> NAME="CentOS Linux"
> VERSION="7 (Core)"
> 
> 
> sysctl -a | grep dirty
> vm.dirty_background_bytes = 0
> vm.dirty_background_ratio = 10

Considering your 256GB of physical memory, this means you can dirty up to 25GB
of pages in cache before the kernel starts to write them to storage.

You might want to trigger these lighter background syncs well before hitting
this limit.

> vm.dirty_bytes = 0
> vm.dirty_expire_centisecs = 3000
> vm.dirty_ratio = 20

This is 20% of your 256GB of physical memory. After this limit, writes have to
go directly to disk. Considering the time to write to SSD compared to memory,
and the amount of data to sync in the background as well (52GB), this could be
very painful.
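One common approach is to switch to the *_bytes variants with much smaller,
explicit values, e.g. in a file under /etc/sysctl.d/ (values below are purely
illustrative, tune them for your storage):

  vm.dirty_background_bytes = 1073741824   # start background writeback at 1GB
  vm.dirty_bytes = 4294967296              # throttle writers at 4GB of dirty data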

> vm.dirty_writeback_centisecs = 500
> 
> 
> > Do you have a proof that swap was the problem?
> 
> 
> not at all but after switch to swappiness to 10, cluster doesnt sunndletly
> swap anymore from a month
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-10-09 Thread Jehan-Guillaume de Rorthais
On Sat, 9 Oct 2021 09:55:28 +0300
Andrei Borzenkov  wrote:

> On 08.10.2021 16:00, damiano giuliani wrote:
> > ...
> > the servers are all resoruce overkills with 80 cpus and 256 gb ram even if
> > the db ingest milions records x day, the network si bonded 10gbs, ssd disks.

I don't remember if we discussed this: have you ever experienced some IO bursts
during eg. batches or high R/W concurrency? I'm thinking about IO/cache pressure
where the system can stall at some point...
We even experienced really bad IO behavior under pressure because of some
misunderstanding between MegaRAID and kernel 4.x (4.15 maybe?)...

What are your settings for vm.dirty_* ?

> > ...
> > So it turn out that a lil bit of swap was used and i suspect corosync
> > process were swapped to disks creating lag where 1s default corosync
> > timeout was not enough.
> 
> But you do not know whether corosync was swapped out at all. So it is
> just guess.

Exactly. Moreover, corosync mlocks itself. As I wrote in an earlier answer,
Damiano should probably check his Corosync logs for errors on **startup**.

> > So it is, swap doesnt log anything and moving process to allocated ram to
> > swap take times more that 1s default timeout (probably many many mores).
> > i fix it changing the swappiness of each servers to 10 (at minimum)
> > avoinding the corosync process could swap.
> 
> swappiness kernel parameter does not really prevent swap from being used.

Indeed. But it's a trade-off between swapping process memory or freeing
memory by removing data from the cache. For database servers, it is advised to
use a lower value for swappiness anyway, around 5-10, as a swapped process means
longer queries, data staying longer in caches, piling-up sessions, etc.

But I still doubt corosync could be swapped, unless it complained on startup
that it couldn't mlock its memory.
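A quick way to check both points on a running node might be (a sketch, assuming
a single corosync process):

  grep VmLck  /proc/$(pidof corosync)/status   # how much memory is locked
  grep VmSwap /proc/$(pidof corosync)/status   # how much is currently swapped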

> What is your kernel version? On several consecutive kernel versions I
> observed the following effect - once swap started being used at all
> system experienced periodical stalls for several seconds. It feeled like
> frozen system. It did not matter how much swap was in allocated -
> several megabytes was already enough.
> 
> As far as I understand, the problem was not really time to swap out/in,
> but time kernel spent traversing page tables to make decision. I think
> it start with kernel 5.3 (or may be 5.2) and I do not see it any more
> since I believe kernel 5.7.

Interesting.

++
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-10-09 Thread Jehan-Guillaume de Rorthais


Le 9 octobre 2021 00:11:27 GMT+02:00, Strahil Nikolov  a 
écrit :
>What do you mean by 1s default timeout ?

I suppose Damiano is talking about the corosync totem token timeout.
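
For reference, this is the "token" parameter in the totem section of
corosync.conf, expressed in milliseconds. A sketch, if you want to raise it:

  # /etc/corosync/corosync.conf excerpt
  totem {
      token: 3000
  }

  # then, after reloading/restarting corosync, check the value actually in use:
  corosync-cmapctl | grep totem.token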

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-10-08 Thread Jehan-Guillaume de Rorthais
On Fri, 8 Oct 2021 15:00:30 +0200
damiano giuliani  wrote:

> Hi Guys,

Hi,

Good to hear from you, thanks for the follow-up!

My answer below.

> ...
> So it turn out that a lil bit of swap was used and i suspect corosync
> process were swapped to disks creating lag where 1s default corosync
> timeout was not enough.

Did you see the swap activity (in/out, not just swap occupation) happen at the
same time the member was lost on the corosync side?

Did you check whether corosync or some of its libs were indeed in swap?
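
For instance, something like this would show it (assuming a single corosync
process on the node):

  # per-process swap usage, in kB
  grep VmSwap /proc/$(pidof corosync)/status
  # global swap-in/swap-out activity (si/so columns)
  vmstat 1 5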

> So it is, swap doesnt log anything and moving process to allocated ram to
> swap take times more that 1s default timeout (probably many many mores).

Well, I have two different thoughts.

First, corosync now sits on a lot of memory because of knet. Did you try to
switch back to udpu, which uses way less memory?

Second, a colleague suggested I check whether corosync mlocks itself. And indeed,
it mlockall()s itself (see mlock(2)) in physical memory. The mlockall call might
fail, but the error doesn't stop corosync from starting anyway. Check your logs
for the error:

  "Could not lock memory of service to avoid page faults"

On my side, max locked memory is unlimited in the ulimit settings. Check the values
in /proc/$(pidof corosync)/limits (be careful with the ulimit command, check the proc
itself).
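
For instance (again assuming a single corosync process):

  grep -i 'max locked memory' /proc/$(pidof corosync)/limits
  grep VmLck /proc/$(pidof corosync)/status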

> i fix it changing the swappiness of each servers to 10 (at minimum)
> avoinding the corosync process could swap.

That would be my first reflex as well. Keep us informed if that definitely fixed
your failover troubles.

>  this issue which should be easy drove me crazy because nowhere process
> swap is tracked on logs but make corosync trigger the timeout and make the
> cluster failover.

This is really interesting and useful.

Thanks,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-07-23 Thread Jehan-Guillaume de Rorthais
On Fri, 23 Jul 2021 12:52:00 +0200
damiano giuliani  wrote:

> the time query isnt the problem, is known that took its time. the network
> is 10gbs bonding, quite impossible to sature with queries :=).

Everything is possible, it's just harder :)

[...]
> checking again the logs what for me is not clear its the cause of the loss
> of quorum and then fence the node.

As said before, according to the logs from other nodes, ltaoperdbs02 did not
answer the TOTEM protocol anymore, so it left the communication group. But
worse, it did it without saying goodbye properly:

  > [TOTEM ] Failed to receive the leave message. failed: 1 

From this exact time, the node is considered "unclean", i.e. its state is
"unknown". To solve this trouble, the cluster needs to fence it to set a
predictable state: OFF. So, the reaction to the trouble is sane.

Now, from the starting point of this conversation, the question is what
happened? Logs on other nodes will probably not help, as they just witnessed a
node disappearing without any explanation.

Logs from ltaoperdbs02 might help, but the corosync log you sent stops at
00:38:44, almost 2 minutes before the fencing as reported by the other nodes:

  > Jul 13 00:40:37 [228699] ltaoperdbs03pengine:  warning: pe_fence_node:
  >Cluster node ltaoperdbs02 will be fenced: peer is no longer part of


> So the cluster works flawessy as expected: as soon ltaoperdbs02 become
> "unreachable", it formed a new quorum, fenced the lost node and promoted
> the new master.

Exactly.

> What i cant findout is WHY its happened.
> there are no useful  information into the system logs neither into the
> Idrac motherboard logs.

Because I suppose some logs were not synced to disk when the server was
fenced.

Either the server clocks were not in sync (I doubt it), or you really lost almost
2 minutes of logs.

> There is a way to improve or configure a log system for fenced / failed
> node?

Yes:

1. Set up rsyslog to export logs to some dedicated logging servers. Such
servers should receive and save logs from your clusters and other hardware
(network gear?) and keep them safe. You will not lose messages anymore (see the
rsyslog sketch after this list).

2. Gather a lot of system metrics and keep them safe (e.g. export them using pcp,
collectd, etc.). Metrics and visualization are important to cross-compare with
logs and to pinpoint something behaving outside of the usual scope.
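
For point 1, the forwarding rule itself can be as simple as this sketch
(assuming a log host named "loghost" listening on TCP port 514):

  # /etc/rsyslog.d/90-remote.conf, on each cluster node
  *.*  @@loghost:514    # "@@" means TCP, a single "@" would be UDP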


Looking at your log, I still find your query times suspicious. I'm not
convinced they are the root cause, they might just be a bad symptom/signal
of something going wrong there. Having a one-row INSERT taking 649.754ms is
suspicious. Maybe it's just a locking problem, maybe there's some CPU-bound
PostGIS processing involved, maybe with some GIN or GiST indexes, but it's still
suspicious considering the server is over-sized in performance, as you stated...

And maybe the network or SAN had a hiccup and corosync was too sensitive
to it. Check the retransmit and timeout parameters?


Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Re: Two node cluster without fencing and no split brain?

2021-07-22 Thread Jehan-Guillaume de Rorthais
On Thu, 22 Jul 2021 15:36:03 +0200
"Ulrich Windl"  wrote:

> >>> Jehan-Guillaume de Rorthais  schrieb am 22.07.2021 um
> 12:05 in
> Nachricht <20210722120537.0d65c2a1@firost>:
> > On Wed, 21 Jul 2021 22:02:21 -0400
> > "Frank D. Engel, Jr."  wrote:
> > 
> >> In OpenVMS, the kernel is aware of the cluster.  As is mentioned in that 
> >> presentation, it actually stops processes from running and blocks access 
> >> to clustered storage when quorum is lost, and resumes them appropriately 
> >> once it is re-established.
> >> 
> >> In other words... no reboot, no "death" of the cluster node or special 
> >> arrangements with storage hardware...  If connectivity is restored, the 
> >> services are simply resumed.
> > 
> > Well, when losing the quorum, by default Pacemaker stop its local
> resources.
> 
> But when a node without quorum performs any actions it may corrupt data (e.g.
> writing to a non-shared filesystem like ext3 on a shared medium like iSCSI or
> FC_SAN).

In the case you are describing, the storage itself should forbid the situation
where a non-shared filesystem could be mounted on multiple servers at the same
time.

If you can't do this on the storage side, the simplest way to do it is using the
LVM system ID restriction (see lvmsystemid(7)). This restriction strictly allows 0 or
1 node to access the shared VG. The name of the node allowed to activate the VG
is written on the storage side. LVM will fail on any other node trying to activate
the shared VG. There's a Pacemaker agent taking care of this.
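
For instance, a sketch with pcs, assuming a shared VG named "shared_vg" and the
LVM-activate agent shipped with resource-agents:

  pcs resource create vg_shared ocf:heartbeat:LVM-activate \
    vgname=shared_vg vg_access_mode=system_id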

I did some PoCs using this; it is really easy to manage.

But I suspect OP is talking about a distributed clustered FS anyway, so this is
a completely different beast I never dealt with...

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF agents after packages updates - still work?

2021-07-22 Thread Jehan-Guillaume de Rorthais
On Sat, 19 Jun 2021 08:32:02 +0100
lejeczek  wrote:

> I've just yesterday updated OS packages among which some 
> were for various PCS components, to versions:
> corosynclib-3.1.0-5.el8.x86_64
> pacemaker-schemas-2.1.0-2.el8.noarch
> pacemaker-cluster-libs-2.1.0-2.el8.x86_64
> pacemaker-cli-2.1.0-2.el8.x86_64
> pacemaker-libs-2.1.0-2.el8.x86_64
> resource-agents-paf-2.3.0-1.noarch
> pacemaker-2.1.0-2.el8.x86_64
> resource-agents-4.1.1-96.el8.x86_64
> corosync-3.1.0-5.el8.x86_64
> pcs-0.10.8-2.el8.x86_64

For the sake of the mailing list history, lejeczek opened an issue there:

  https://github.com/ClusterLabs/PAF/issues/194

To make it short: this is a temporary regression in Pacemaker 2.1. This has
been fixed and should be released soon.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-07-22 Thread Jehan-Guillaume de Rorthais
Hi,

On Wed, 14 Jul 2021 07:58:14 +0200
"Ulrich Windl"  wrote:
[...]
> Could it be that a command saturated the network?
> Jul 13 00:39:28 ltaoperdbs02 postgres[172262]: [20-1] 2021-07-13 00:39:28.936
> UTC [172262] LOG:  duration: 660.329 ms  execute :  SELECT
> xmf.file_id, f.size, fp.full_path  FROM ism_x_medium_file xmf  JOIN#011
> ism_files f  ON f.id_file = xmf.file_id  JOIN#011 ism_files_path fp  ON
> f.id_file = fp.file_id  JOIN ism_online o  ON o.file_id = xmf.file_id  WHERE
> xmf.medium_id = 363 AND  xmf.x_media_file_status_id = 1  AND
> o.online_status_id = 3GROUP BY xmf.file_id, f.size,  fp.full_path   LIMIT
> 7265 ;

I doubt such a query could saturate the network. The query time itself isn't
proportional to the result set size.

Moreover, there are only three fields per row and, according to their names, I
doubt the row size is really big.

Plus, even if the result set were that big, chances are that the frontend would
not be able to consume it as fast as the network delivers it, unless the frontend
does nothing really fancy with the dataset. So the frontend itself might saturate
before the network, giving some slack to the latter.

However, if this query time is unusual, that might indicate some pressure on
the server from some other source (CPU? memory? IO?). Detailed metrics would help.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Two node cluster without fencing and no split brain?

2021-07-22 Thread Jehan-Guillaume de Rorthais
On Thu, 22 Jul 2021 13:10:45 +0300
Andrei Borzenkov  wrote:

> On Thu, Jul 22, 2021 at 1:05 PM Jehan-Guillaume de Rorthais
>  wrote:
> > To do some rewording in regard with the current topic: if Pacemaker is able
> > to stop its resources after a quorum lost, it will not reboot, no "death"
> > either. 
> 
> And how exactly is the remaining quorate partition supposed to know
> that this, inquorate, partition has stopped its resources and it is
> now safe to activate these resources somewhere else?

As far as I know, without active fencing, quorate nodes wait for the watchdog
timeout to expire before taking over the resources. So it relies on trusting the
watchdog if Pacemaker and/or sbd are not able to stop the resources.

However, in this thread, I believe the OP is talking about a clustered FS, not a
single resource moving between nodes. So a distributed lock manager is involved
as well to take care of the writes... and that's terra incognita to me.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Two node cluster without fencing and no split brain?

2021-07-22 Thread Jehan-Guillaume de Rorthais
On Thu, 22 Jul 2021 12:56:40 +0300
Andrei Borzenkov  wrote:

> On Thu, Jul 22, 2021 at 12:43 PM Jehan-Guillaume de Rorthais
>  wrote:
> >
> > On Wed, 21 Jul 2021 12:45:40 -0400
> > Digimer  wrote:
> >  
> > > On 2021-07-21 3:26 a.m., Jehan-Guillaume de Rorthais wrote:  
> > > > Hi,
> > > >
> > > > On Wed, 21 Jul 2021 04:28:30 + (UTC)
> > > > Strahil Nikolov via Users  wrote:
> > > >  
> > > >> Hi,
> > > >> consider using a 3rd system as a Q disk. Also, you can use iscsi from
> > > >> that node as a SBD device, so you will have proper fencing .If you
> > > >> don't have a hardware watchdog device, you can use softdog kernel
> > > >> module for that. Best  
> > > >
> > > > Having 3 nodes for quorum AND watchdog (using softdog in last resort) is
> > > > enough, isn't it?
> > > > But yes, having a shared storage to add a SBD device is even better.
> > > >
> > > > Regards,  
> > >
> > > The third node with storage-based death is a way of creating a fence
> > > configuration.  
> >
> > Yes, poison pill.
> >  
> > > It works because it's fencing, not because it's quorum.  
> >
> > That's not what I said. Two node + sbd is safe. OK.
> >
> > My consideration/question was: 3 nodes + watchdog, without storage-based
> > death, looks good enough to me. Do I miss something?
> >  
> 
> From an integrity point of view it is equivalent to SBD. SBD at the
> end relies on a watchdog as well. So if you have a reliable hardware
> watchdog it should be OK.
> 
> From an operational point of view pacemaker does not have the notion
> of "cluster wide shutdown". Which means - as soon as you stop
> pacemaker on two nodes the third node will commit suicide because it
> goes out of quorum. Last man standing may help here, I have not tried
> it in true three node cluster. This does not happen with SBD as long
> as storage remains accessible.

Ok, understood. Thanks for the details Andrei!
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Two node cluster without fencing and no split brain?

2021-07-22 Thread Jehan-Guillaume de Rorthais
On Wed, 21 Jul 2021 22:02:21 -0400
"Frank D. Engel, Jr."  wrote:

> In OpenVMS, the kernel is aware of the cluster.  As is mentioned in that 
> presentation, it actually stops processes from running and blocks access 
> to clustered storage when quorum is lost, and resumes them appropriately 
> once it is re-established.
> 
> In other words... no reboot, no "death" of the cluster node or special 
> arrangements with storage hardware...  If connectivity is restored, the 
> services are simply resumed.

Well, when losing quorum, by default Pacemaker stops its local resources.
Considering clustered storage, the resources are the lock manager, iSCSI or
some other means, the FS, etc.
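
This default behavior is driven by the "no-quorum-policy" cluster property, e.g.
with pcs:

  # "stop" is the default; other values are ignore, freeze, demote (recent
  # Pacemaker releases only) and suicide
  pcs property set no-quorum-policy=stop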

However, if the resource stop actions don't succeed, THEN the node resets
itself. Should your cluster have active fencing, the node might be reset by some
external means.

As Digimer wrote, «Quorum is a tool for when things are working predictably».
To reword that in regard to the current topic: if Pacemaker is able to
stop its resources after quorum is lost, there will be no reboot and no "death" either.

> I had a 3-node OpenVMS cluster running virtualized at one point on the 
> hobbyist license and my cluster storage for that setup was simply to 
> mirror the disks across the three nodes (via software which is 
> integrated into OpenVMS); almost like RAID 1 across the network.  If I 
> "broke" the cluster and one of the servers lost quorum (due to 
> connectivity) it would just sit and wait for the connectivity to be 
> restored, then resync the storage and pick up essentially where it left off.

I believe this might be possible using a Pacemaker stack. However, I never
built such a cluster. So hopefully some other people around here with more
experience with clustered FS will confirm or refute this with more details.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Two node cluster without fencing and no split brain?

2021-07-22 Thread Jehan-Guillaume de Rorthais
On Wed, 21 Jul 2021 12:45:40 -0400
Digimer  wrote:

> On 2021-07-21 3:26 a.m., Jehan-Guillaume de Rorthais wrote:
> > Hi,
> > 
> > On Wed, 21 Jul 2021 04:28:30 + (UTC)
> > Strahil Nikolov via Users  wrote:
> >   
> >> Hi,
> >> consider using a 3rd system as a Q disk. Also, you can use iscsi from that
> >> node as a SBD device, so you will have proper fencing .If you don't have a
> >> hardware watchdog device, you can use softdog kernel module for that. Best
> > 
> > Having 3 nodes for quorum AND watchdog (using softdog in last resort) is
> > enough, isn't it?
> > But yes, having a shared storage to add a SBD device is even better.
> > 
> > Regards,  
> 
> The third node with storage-based death is a way of creating a fence
> configuration. 

Yes, poison pill.

> It works because it's fencing, not because it's quorum.

That's not what I said. Two nodes + SBD is safe. OK.

My consideration/question was: 3 nodes + watchdog, without storage-based death,
looks good enough to me. Am I missing something?

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Two node cluster without fencing and no split brain?

2021-07-21 Thread Jehan-Guillaume de Rorthais
On Wed, 21 Jul 2021 04:50:09 -0400
"Frank D. Engel, Jr."  wrote:

> OpenVMS can do this sort of thing without a requirement for fencing (you 
> still need a third disk as a quorum device in a 2-node cluster), but 
> Linux (at least in its current form) cannot.

Yes it can, as far as what you are describing.

Pacemaker supports the following architectures without "active" fencing:

* 2 nodes + shared storage (as quorum device + poison pill)
* 3 nodes (for quorum) + watchdog

> From what I can tell the fencing requirements in the Linux solution are
> mainly due to limitations of how deeply the clustering solution is integrated
> into the kernel.

Could you explain what you mean? I'm not sure how the kernel is involved there.
The kernel itself can be the problem. That's why using softdog is discouraged.

> There is an overview here: 
> https://sciinc.com/remotedba/techinfo/tech_presentations/Boot%20Camp%202013/Bootcamp_2013_Comparison%20of%20Red%20Hat%20Clusters%20with%20OpenVMS%20Clusters.pdf

This is a quite old presentation and many things have changed. I'm fairly sure a lot
of its content is outdated. First, corosync evolved a lot. Second, RHCS now uses
Pacemaker by default, not CMAN-based HA.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Two node cluster without fencing and no split brain?

2021-07-21 Thread Jehan-Guillaume de Rorthais
Hi,

On Wed, 21 Jul 2021 04:28:30 + (UTC)
Strahil Nikolov via Users  wrote:

> Hi,
> consider using a 3rd system as a Q disk. Also, you can use iscsi from that
> node as a SBD device, so you will have proper fencing .If you don't have a
> hardware watchdog device, you can use softdog kernel module for that. Best

Having 3 nodes for quorum AND a watchdog (using softdog as a last resort) is enough,
isn't it?
But yes, having shared storage to add an SBD device is even better.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Re: QDevice vs 3rd host for majority node quorum

2021-07-15 Thread Jehan-Guillaume de Rorthais
On Thu, 15 Jul 2021 12:46:10 +0200
"Ulrich Windl"  wrote:

> >>> Jehan-Guillaume de Rorthais  schrieb am 15.07.2021 um  
> 10:09 in
> Nachricht <20210715100930.06b45f5b@firost>:
> > Hi all,
> > 
> > On Tue, 13 Jul 2021 19:55:30 + (UTC)
> > Strahil Nikolov  wrote:
> >   
> >> In some cases the third location has a single IP and it makes sense to use
> >>  
> 
> > it  
> >> as QDevice. If it has multiple network connections to that location ‑ use  
> a
> >> full blown node .  
> > 
> > By the way, what's the point of multiple rings in corosync when we can  
> setup
> > bonding or teaming on OS layer?  
> 
> Good question: back in the times of HP-UX and ServiceGuard we had two
> networks, each using bonding to ensure cluster communication.
> With Linux and pacemaker we have the same, BUT corosync (as of SLES15 SP2)
> seems to use them not as redundancy, but in parallel.

Indeed, it does. That's what I've experienced as well with a customer where
bandwidth on the LAN was free, but billed on the WAN interface. When I dug for
answers, I found a paper on TOTEM explaining that the protocol uses both rings
in a kind of round-robin fashion. I don't remember the fine details though.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] QDevice vs 3rd host for majority node quorum

2021-07-15 Thread Jehan-Guillaume de Rorthais
Hi all,

On Tue, 13 Jul 2021 19:55:30 + (UTC)
Strahil Nikolov  wrote:

> In some cases the third location has a single IP and it makes sense to use it
> as QDevice. If it has multiple network connections to that location - use a
> full blown node .

By the way, what's the point of multiple rings in corosync when we can set up
bonding or teaming at the OS layer?

I remember that some time ago bonding was recommended over corosync rings, because
the totem protocol on multiple rings wasn't as flexible as bonding/teaming,
and multiple rings were only useful to corosync/pacemaker whereas bonding was
useful for all other services on the server.

...But that was before the knet era. Did it change?

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Postgres Cluster PAF problems

2021-06-30 Thread Jehan-Guillaume de Rorthais
On Wed, 30 Jun 2021 14:36:29 +0200
damiano giuliani  wrote:

> the replication is async, having a look into the postgres logs seems some
> updates failed cuz no master available.

I'm not sure I understand what you mean. As Pacemaker recovered the primary on
the same node, standbys and clients lost their connections for a few seconds.

But you should not lose UPDATEs/INSERTs, as the primary has been recovered on the
same node.

> i dont expect resource problems (im investingating ayway), the nodes have
> 200gb RAM , 80 cpu and alot of free hdd space.

RAM, CPU and space don't give you 100% security.

> how you guys suggest me to find out why the monitor timed out?

I have no idea. Look at your collected metrics or system logs to pinpoint some
heavy load or abnormal behavior?

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Postgres Cluster PAF problems

2021-06-30 Thread Jehan-Guillaume de Rorthais
Hi,

On Wed, 30 Jun 2021 13:44:28 +0200
damiano giuliani  wrote:

> looks some applications lost connection to the master losing some
> update/insert.
> 
> i found the cause into the logs, the psqld-monitor went timeout after
> 1ms and the master resource been demote, the instance stopped and then
> promoted to master again, generating few seconds of disservices (no master
> during the described process)

This is the normal behaviour after a timeout.

I'm surprised you lost some inserts/updates though. Maybe this is related to your
PostgreSQL setup (synchronous_commit? fsync?)
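
You can check them quickly on the primary, e.g.:

  psql -Atc "SELECT name, setting FROM pg_settings \
             WHERE name IN ('synchronous_commit', 'fsync')"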

> i noticed a redundant info:
> Update score of "ltaoperdbsXX" from 990 to 1000 because of a change in the
> replication lag
> seems some kind of network lag?

This is not related. Scores are set based on each standby's lag. A standby
received some data faster than another one. Nothing more. But I admit this is
quite chatty in your logs...

> i attached the log could be useful to dig further.
> Can some guys point me on the right direction, should be really appreciate.

Unfortunately, there's nothing in your log that could explain the timeout.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] #clusterlabs IRC channel

2021-05-26 Thread Jehan-Guillaume de Rorthais
On Wed, 26 May 2021 14:30:44 -0500
kgail...@redhat.com wrote:

> Without further comments, we've gone ahead with Libera.Chat as the new
> home of #clusterlabs. There is a new wiki page with the channel
> details:
> 
>  https://wiki.clusterlabs.org/wiki/ClusterLabs_IRC_channel
> 
> so we can just point to that in documentation and such. That way, the
> info only needs to be updated in one place whenever it changes.

For what it's worth, Debian is using irc.debian.org as a DNS alias pointing
to OFTC. When they moved from freenode to OFTC (a long, long time ago), users had
little to do to follow:

https://lists.debian.org/debian-devel-announce/2006/05/msg00012.html

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [Problem] In RHEL8.4beta, pgsql resource control fails.

2021-04-28 Thread Jehan-Guillaume de Rorthais
On Wed, 28 Apr 2021 12:00:40 -0500
Ken Gaillot  wrote:

> On Wed, 2021-04-28 at 18:14 +0200, Jehan-Guillaume de Rorthais wrote:
> > Hi all,
> > 
> > It seems to me the concern raised by Ulrich hasn't been discussed:
> > 
> > On Wed, 12 Apr 2021 Ulrich Windl wrote:
> >   
> > > Personally I think an RA calling crm_mon is inherently broken: Will
> > > it ever
> > > pass ocf-tester?  
> 
> Calling the command-line tools in an agent can be OK in some cases. The
> main concerns are:
> 
> * Time-of-check/time-of-use: cluster status can change immediately, so
> the agent should behave reasonably if a query result is incorrect at
> the moment it's used. Ideally there would be no case where the agent
> could incorrectly report success for an action.
> 
> * No commands that *change* the configuration (other than setting node
> attributes) should ever be used. Otherwise there's a potential for an
> infinite loop between the agent and scheduler.
> 
> * It's best to use tools' XML output when available, because that
> should be stable across Pacemaker releases, while the text output may
> not be. Aside from crm_mon, XML output is a recent addition, so some
> consideration must be given to backward compatibility and/or requiring
> a minimum Pacemaker version.
> 
> * Only the configuration section of the CIB has a guaranteed schema.
> The status section can theoretically change from release to release,
> although in practice it has changed very little over the years.
> 
> I don't use ocf-tester so I can't speak to that, but I suspect it could
> work if you exported a CIB_file variable with a sample cluster status
> beforehand. (CIB_file makes the cluster commands act as if the
> specified file is the live CIB at the moment.)
> 
> > Would it be possible to rely on the following command ?
> > 
> >   cibadmin --query --xpath "//status/node_state[@join='member']" | \
> > grep -Po 'uname="\K[^"]+'
> > 
> > 
> > Regards,  
> 
> Only full cluster nodes will have a "join" attribute, so that query
> won't catch active remote nodes or guest nodes. Whether that's good or
> bad depends on what you're looking for.

That was an example of replacing the crm_mon dependency with cibadmin.
AFAIU, this agent uses crm_mon to:

* find the node hosting the promoted clone
* check that a node exists
* check that a node has fully joined

All of these uses seem accessible by parsing the cibadmin status section
output (or --xpath).

> The plus side is that it's a query and it returns XML.

Indeed.

> The downsides are that node status can change quickly, so it could
> theoretically be inaccurate a moment later when you use it, and the
> status section is not guaranteed to stay in that format (though I
> expect that particular part will).

There are already version checks for crm_mon in the pgsql RA code anyway, relying on
OCF_RESKEY_crm_feature_set.

> A minor point: that query will return the entire node_state XML
> subtree; you can add -n/--no-children to return just the node_state
> element itself.

Nice!

I was playing with xmllint as well, for more advanced XPath support, but it
would add a strong dependency.
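
Just to illustrate the idea (untested), something like this would extract the
member names without crm_mon:

  cibadmin --query | \
    xmllint --xpath '//status/node_state[@join="member"]/@uname' -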

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [Problem] In RHEL8.4beta, pgsql resource control fails.

2021-04-28 Thread Jehan-Guillaume de Rorthais
Hi all,

It seems to me the concern raised by Ulrich hasn't been discussed:

On Wed, 12 Apr 2021 Ulrich Windl wrote:

> Personally I think an RA calling crm_mon is inherently broken: Will it ever
> pass ocf-tester?

Would it be possible to rely on the following command ?

  cibadmin --query --xpath "//status/node_state[@join='member']" | \
grep -Po 'uname="\K[^"]+'


Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Autostart/Enabling of Pacemaker and corosync

2021-04-27 Thread Jehan-Guillaume de Rorthais
On Mon, 26 Apr 2021 18:04:41 + (UTC)
Strahil Nikolov  wrote:

> I prefer that the stack is auto enabled. Imagine that you got a DB that is
> replicated and primary DB node is fenced. You would like that node to join
> the cluster and if possible to sync with the new primary instead of staying
> down.

In the case of PostgreSQL, the failing primary may not be able to fail back
automatically to follow the new primary. Worse, if it actually enters
replication, it might just silently become a corrupted standby, giving a false
feeling of safety, until a new failover occurs.

PAF doesn't handle auto-failback (e.g. pg_rewind) by design, to avoid code
complexity. We don't want to give a false feeling of a perfect
full-availability/failback/fully-automated-admin'ed PgSQL cluster. If something
went wrong with your DB, you'd better check and fix it. You need both a
sysadmin and a DBA on board to take care of the availability and safety of your
cluster.
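
For the record, when you do need to fail back an old primary manually,
pg_rewind is the usual tool. A sketch, assuming wal_log_hints or data checksums
are enabled, with example paths/conninfo:

  pg_rewind --target-pgdata=/var/lib/pgsql/data \
            --source-server='host=new-primary user=postgres'

then double-check the timeline/LSN before letting Pacemaker start it again as a
standby.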

Note that auto-failback of secondary nodes is safe, as long as they are able
to actually catch up with production. Maybe we could imagine some safety
belts in PAF's code to allow Pacemaker auto-start on boot, but refuse
to start a badly shaped PostgreSQL.

> One such example is the SAP HANA DB. Imagine that the current primary
> node looses storage and it failed to commit all transactions to disk. Without
> replication you will endure data loss for the last 1-2 minutes (depends on
> your monitoring interval) unless you got a replication.

PAF is a shared-nothing approach; it requires replication between nodes.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: Re: Antw: [EXT] best practice for scripting

2021-04-13 Thread Jehan-Guillaume de Rorthais
On Tue, 13 Apr 2021 12:17:38 +0200
"Ulrich Windl"  wrote:
[...]
> >good for SUSE! unfortunately RHEL didn't include the utility...  
> 
> Technically it should work, but there could be "political" reasons.

A few years ago, it was more about incompatibilities than politics.

I'm not sure of the status of crmsh today. It might be worth a try.

Regards,

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF resource agent & stickiness - ?

2021-04-12 Thread Jehan-Guillaume de Rorthais
On Sun, 11 Apr 2021 16:03:34 +0100
lejeczek  wrote:

> On 10/04/2021 16:19, Jehan-Guillaume de Rorthais wrote:
> >
> > Le 10 avril 2021 14:22:34 GMT+02:00, lejeczek  a
> > écrit :  
> >> Hi guys.
> >>
> >> Any users perhaps experts on PAF agent if happen to read
> >> this - a question - with pretty regular 3-node cluster when
> >> node on which "master" runs goes down then cluster/agent
> >> successfully moves 'master' to a next node.  
> > How did you take down the node ?
> > And what are the scores at this time ?  
> It still boggles my mind, not being an expert though for a 
> while have 'pacemaker' run in my setups, how relationships 
> and dependencies between resource work.
> I've had another resource which I preferred to hold to a 
> specific node and also colocation constraint where that 
> resource was to run with vIP of pgsqld
> 
>   HA-10-3-1-226 with HA-10-1-1-226 (score:INFINITY) 
> (id:colocation-HA-10-3-1-226-HA-10-1-1-226-INFINITY)
> 
> HA-10-1-1-226 is pgsqld's
> 
> logic of that just brakes my mental health :)
> So cancel and mark as non-existent the issue I originally 
> raised.
> many thanks, L.

Colocation with an infinite score might be tricky to understand, depending on which
resource must colocate with the other one (or the other way around), with what
other properties, etc.

A quick quote from doc:

  > Remember, because INFINITY was used, if B can’t run on any of the cluster
  > nodes (for whatever reason) then A will not be allowed to run. Whether A is
  > running or not has no effect on B.
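
With pcs, the quote above maps to something like:

  # "A with B": A is placed where B runs; if B can't run anywhere, neither can A
  pcs constraint colocation add A with B INFINITY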

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF resource agent & stickiness - ?

2021-04-12 Thread Jehan-Guillaume de Rorthais
On Sun, 11 Apr 2021 04:21:02 + (UTC)
Strahil Nikolov  wrote:

> Better check for a location constraint created via 'pcs resource move'!
> pcs constraint location --full | grep cli
> Best Regards,
> Strahil Nikolov

Oh yes, this is a good one, it should probably go into our FAQ.

Thanks,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] PAF resource agent & stickiness - ?

2021-04-10 Thread Jehan-Guillaume de Rorthais


Le 10 avril 2021 14:22:34 GMT+02:00, lejeczek  a écrit :
>Hi guys.
>
>Any users perhaps experts on PAF agent if happen to read 
>this - a question - with pretty regular 3-node cluster when 
>node on which "master" runs goes down then cluster/agent 
>successfully moves 'master' to a next node. 

How did you take down the node?
And what are the scores at this time?
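
For instance, you can dump the scores from the live CIB with (the output format
varies a bit across Pacemaker versions):

  crm_simulate -sL | grep -iE 'promotion score|master'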

> but..
>When node which held master, which was off/down, comes back 
>up then master/agent reshuffles PAF resources again and 
>moves 'master' back that last 'master' node and...
>slapping a 'stickiness' on the resource does not seem to 
>have any affect, cluster/agent still persists on moving 
>'master' to that same node.

Could you share your cluster setup?

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Community adoption of PAF vs pgsql

2021-03-26 Thread Jehan-Guillaume de Rorthais
Hi,

I'm one of the PAF author, so I'm biased.

On Fri, 26 Mar 2021 14:51:28 +
Isaac Pittman  wrote:

> My team has the opportunity to update our PostgreSQL resource agent to either
> PAF (https://github.com/ClusterLabs/PAF) or pgsql
> (https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/pgsql),
> and I've been charged with comparing them.

In my opinion, you should spend time actually building some "close-to-prod"
clusters and training on them. Then you'll be able to choose based on some team
experience.

Both agents have a very different spirit and very different administrative tasks.

Break your cluster, do some switchovers, some failovers, practice how to fail back
a node, and so on.

> After searching various mailing lists and reviewing the code and
> documentation, it seems like either could suit our needs and both are
> actively maintained.
> 
> One factor that I couldn't get a sense of is community support and adoption:
> 
>   *   Does PAF or pgsql enjoy wider community support or adoption, especially
> for new projects? (I would expect many older projects to be on pgsql due to
> its longer history.)

Sadly, I have absolutely no clue...

>   *   Does either seem to be on the road to deprecation?

PAF is not on its way to deprecation, I have a pending TODO list for it.

I would bet pgsql is not on its way to deprecation either, but I can't speak
for the real authors.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Q: Is there any plan for pcs to support corosync-notifyd?

2021-03-18 Thread Jehan-Guillaume de Rorthais
On Thu, 18 Mar 2021 17:29:59 +0900
井上和徳  wrote:

> On Tue, Mar 16, 2021 at 10:23 PM Jehan-Guillaume de Rorthais
>  wrote:
> >  
> > > On Tue, 16 Mar 2021, 09:58 井上和徳,  wrote:
> > >  
> > > > Hi!
> > > >
> > > > Cluster (corosync and pacemaker) can be started with pcs,
> > > > but corosync-notifyd needs to be started separately with systemctl,
> > > > which is not easy to use.  
> >
> > Maybe you can add to the [Install] section of corosync-notifyd a dependency
> > with corosync? Eg.:
> >
> >   WantedBy=corosync.service
> >
> > (use systemctl edit corosync-notifyd)
> >
> > Then re-enable the service (without starting it by hands).  
> 
> I appreciate your proposal. How to use WantedBy was helpful!
> However, since I want to start the cluster (corosync, pacemaker) only
> manually, it is unacceptable to start corosync along with corosync-notifyd at
> OS boot time.

This is perfectly fine.

I suppose corosync-notifyd is starting because the default service config has:

  [Install]
  WantedBy=multi-user.target

If you want corosync-notifyd to be enabled ONLY on corosync startup, but not on
system startup, you have to remove this startup dependency on the "multi-user"
target. So, your drop-in setup of corosync-notifyd should be (remove leading
spaces):

  cat <[...]
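
Something along these lines should do it (untested sketch, the file and override
names are mine):

  # /etc/systemd/system/corosync-notifyd.service.d/override.conf
  # (e.g. created with "systemctl edit corosync-notifyd")
  [Install]
  WantedBy=
  WantedBy=corosync.service

then re-enable the unit ("systemctl reenable corosync-notifyd") so the new
dependency is taken into account.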

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Q: Is there any plan for pcs to support corosync-notifyd?

2021-03-16 Thread Jehan-Guillaume de Rorthais
> On Tue, 16 Mar 2021, 09:58 井上和徳,  wrote:
> 
> > Hi!
> >
> > Cluster (corosync and pacemaker) can be started with pcs,
> > but corosync-notifyd needs to be started separately with systemctl,
> > which is not easy to use.

Maybe you can add to the [Install] section of corosync-notifyd a dependency
on corosync? E.g.:

  WantedBy=corosync.service

(use systemctl edit corosync-notifyd)

Then re-enable the service (without starting it by hands).
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] maximum token value (knet)

2021-03-12 Thread Jehan-Guillaume de Rorthais
On Thu, 11 Mar 2021 17:51:15 + (UTC)
Strahil Nikolov  wrote:

> Interesting...
> Yet, this doesn't explain why token of 3 causes the nodes to never
> assemble a cluster (waiting for half an hour, using wait_for_all=1) , while
> setting it to 29000 works like a charm.
> 
> Thankfully we got RH subsciption, so RH devs will provide more detailed
> output on the issue.

As far as I understand and remember, Honza is actually a Red Hat dev working on the
pcmk stack :)

Out of pure technical interest and curiosity, I would be interested if you or Honza
kept the list informed of the issue details and resolution.

Thanks!
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Stopping all nodes causes servers to migrate

2021-01-26 Thread Jehan-Guillaume de Rorthais
On Tue, 26 Jan 2021 16:15:55 +0100
Tomas Jelinek  wrote:

> Dne 25. 01. 21 v 17:01 Ken Gaillot napsal(a):
> > On Mon, 2021-01-25 at 09:51 +0100, Jehan-Guillaume de Rorthais wrote:
> >> Hi Digimer,
> >>
> >> On Sun, 24 Jan 2021 15:31:22 -0500
> >> Digimer  wrote:
> >> [...]
> >>>   I had a test server (srv01-test) running on node 1 (el8-a01n01),
> >>> and on
> >>> node 2 (el8-a01n02) I ran 'pcs cluster stop --all'.
> >>>
> >>>It appears like pacemaker asked the VM to migrate to node 2
> >>> instead of
> >>> stopping it. Once the server was on node 2, I couldn't use 'pcs
> >>> resource
> >>> disable ' as it returned that that resource was unmanaged, and
> >>> the
> >>> cluster shut down was hung. When I directly stopped the VM and then
> >>> did
> >>> a 'pcs resource cleanup', the cluster shutdown completed.
> >>
> >> As actions during a cluster shutdown cannot be handled in the same
> >> transition
> >> for each nodes, I usually add a step to disable all resources using
> >> property
> >> "stop-all-resources" before shutting down the cluster:
> >>
> >>pcs property set stop-all-resources=true
> >>pcs cluster stop --all
> >>
> >> But it seems there's a very new cluster property to handle that
> >> (IIRC, one or
> >> two releases ago). Look at "shutdown-lock" doc:
> >>
> >>[...]
> >>some users prefer to make resources highly available only for
> >> failures, with
> >>no recovery for clean shutdowns. If this option is true, resources
> >> active on a
> >>node when it is cleanly shut down are kept "locked" to that node
> >> (not allowed
> >>to run elsewhere) until they start again on that node after it
> >> rejoins (or
> >>for at most shutdown-lock-limit, if set).
> >>[...]
> >>
> >> [...]
> >>>So as best as I can tell, pacemaker really did ask for a
> >>> migration. Is
> >>> this the case?
> >>
> >> AFAIK, yes, because each cluster shutdown request is handled
> >> independently at
> >> node level. There's a large door open for all kind of race conditions
> >> if
> >> requests are handled with some random lags on each nodes.
> > 
> > I'm going to guess that's what happened.
> > 
> > The basic issue is that there is no "cluster shutdown" in Pacemaker,
> > only "node shutdown". I'm guessing "pcs cluster stop --all" sends
> > shutdown requests for each node in sequence (probably via systemd), and
> > if the nodes are quick enough, one could start migrating off resources
> > before all the others get their shutdown request.
> 
> Pcs is doing its best to stop nodes in parallel. The first 
> implementation of this was done back in 2015:
> https://bugzilla.redhat.com/show_bug.cgi?id=1180506
> Since then, we moved to using curl for network communication, which also 
> handles parallel cluster stop. Obviously, this doesn't ensure the stop 
> command arrives to and is processed on all nodes at the exactly same time.
> 
> Basically, pcs sends 'stop pacemaker' request to all nodes in parallel 
> and waits for it to finish on all nodes. Then it sends 'stop corosync' 
> request to all nodes in parallel.

How about adding a step to set/remove "stop-all-resources" on cluster
shutdown/start? This step could either be optional with a new CLI argument, or
applied when --all is given for these commands.

Thoughts?
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Re: Stopping all nodes causes servers to migrate

2021-01-25 Thread Jehan-Guillaume de Rorthais
On Mon, 25 Jan 2021 10:22:20 +0100
"Ulrich Windl"  wrote:

> Maybe it's time for  target-role=stopped">... in CIB ;-)

Could you elaborate on what the differences with "stop-all-resources" would be?

Kind regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Stopping all nodes causes servers to migrate

2021-01-25 Thread Jehan-Guillaume de Rorthais
Hi Digimer,

On Sun, 24 Jan 2021 15:31:22 -0500
Digimer  wrote:
[...]
>  I had a test server (srv01-test) running on node 1 (el8-a01n01), and on
> node 2 (el8-a01n02) I ran 'pcs cluster stop --all'.
> 
>   It appears like pacemaker asked the VM to migrate to node 2 instead of
> stopping it. Once the server was on node 2, I couldn't use 'pcs resource
> disable ' as it returned that that resource was unmanaged, and the
> cluster shut down was hung. When I directly stopped the VM and then did
> a 'pcs resource cleanup', the cluster shutdown completed.

As actions during a cluster shutdown cannot be handled in the same transition
for all nodes, I usually add a step to disable all resources using the property
"stop-all-resources" before shutting down the cluster:

  pcs property set stop-all-resources=true
  pcs cluster stop --all

But it seems there's a very new cluster property to handle that (IIRC, one or
two releases ago). Look at "shutdown-lock" doc:

  [...]
  some users prefer to make resources highly available only for failures, with
  no recovery for clean shutdowns. If this option is true, resources active on a
  node when it is cleanly shut down are kept "locked" to that node (not allowed
  to run elsewhere) until they start again on that node after it rejoins (or
  for at most shutdown-lock-limit, if set).
  [...]
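
E.g., with pcs:

  pcs property set shutdown-lock=true
  # optionally, give up the lock after some delay:
  pcs property set shutdown-lock-limit=30min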

[...]
>   So as best as I can tell, pacemaker really did ask for a migration. Is
> this the case?

AFAIK, yes, because each cluster shutdown request is handled independently at
the node level. That leaves a large door open to all kinds of race conditions if
requests are handled with some random lag on each node.


Regards,
-- 
Jehan-Guillaume de Rorthais
Dalibo
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Maintenance mode status in CIB

2020-10-13 Thread Jehan-Guillaume de Rorthais
On Tue, 13 Oct 2020 04:48:04 -0400
Digimer  wrote:

> On 2020-10-13 4:32 a.m., Jehan-Guillaume de Rorthais wrote:
> > On Mon, 12 Oct 2020 19:08:39 -0400
> > Digimer  wrote:
> >   
> >> Hi all,  
> > 
> > Hi you,
> >   
> >>
> >>   I noticed that there appear to be a global "maintenance mode"
> >> attribute under cluster_property_set. This seems to be independent of
> >> node maintenance mode. It seemed to not change even when using
> >> 'pcs node maintenance --all'  
> > 
> > You can set maintenance-mode using:
> > 
> >   pcs property set maintenance-mode=true
> > 
> > You can read about "maintenance-mode" cluster attribute and "maintenance"
> > node attribute in chapters:
> > 
> >   
> > https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-cluster-options.html
> >  
> >   
> > https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/_special_node_attributes.html
> > 
> > I would bet the difference is that "maintenance-mode" applies to all nodes
> > in one single action. Using 'pcs node maintenance --all', each pcsd daemon
> > apply the local node maintenance independently. 
> > 
> > With the later, I suppose you might have some lag between nodes to actually
> > start the maintenance, depending on external factors. Moreover, you can
> > start/exit the maintenance mode independently on each nodes.  
> 
> Thanks for this.
> 
> A question remains; Is it possible that:
> 
>  name="maintenance-mode" value="false"/>
> 
> Could be set, and a given node could be:
> 
> 
>   
> 
>   
> 
> 
> That is to say; If the cluster is set to maintenance mode, does that
> mean I should consider all nodes to also be in maintenance mode,
> regardless of what their individual maintenance mode might be set to?

I remember a similar discussion happening some months ago. I believe Ken
answered your question there:

  https://lists.clusterlabs.org/pipermail/developers/2019-November/002242.html

The whole answer is informative, but the conclusion might answer your
question:

  >> There is some room for coming up with better option naming and meaning. For
  >> example maybe the cluster-wide "maintenance-mode" should be something
  >> like "force-maintenance" to make clear it takes precedence over node and
  >> resource maintenance. 

I understand here that "maintenance-mode" takes precedence over individual node
maintenance mode.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Maintenance mode status in CIB

2020-10-13 Thread Jehan-Guillaume de Rorthais
On Mon, 12 Oct 2020 19:08:39 -0400
Digimer  wrote:

> Hi all,

Hi you,

> 
>   I noticed that there appear to be a global "maintenance mode"
> attribute under cluster_property_set. This seems to be independent of
> node maintenance mode. It seemed to not change even when using
> 'pcs node maintenance --all'

You can set maintenance-mode using:

  pcs property set maintenance-mode=true

You can read about "maintenance-mode" cluster attribute and "maintenance" node
attribute in chapters:

  
https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-cluster-options.html
 
  
https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/_special_node_attributes.html

I would bet the difference is that "maintenance-mode" applies to all nodes in
one single action. Using 'pcs node maintenance --all', each pcsd daemon applies
the local node maintenance independently.

With the latter, I suppose you might have some lag between nodes to actually
start the maintenance, depending on external factors. Moreover, you can
enter/exit maintenance mode independently on each node.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Tuchanka

2020-10-02 Thread Jehan-Guillaume de Rorthais
On Fri, 2 Oct 2020 15:18:18 +0300
Олег Самойлов  wrote:

> > On 29 Sep 2020, at 11:34, Jehan-Guillaume de Rorthais 
> > wrote:
> > 
> > 
> > Vagrant use virtualbox by default, which supports softdog, but it support
> > many other virtualization plateform, including eg. libvirt/kvm where you
> > can use virtualized watchdog card.
> >   
> >>   
> > 
> > Vagrant can use Chef, Ansible, Salt, puppet, and others to provision VM:
> > 
> >  https://www.vagrantup.com/docs/provisioning
> > 
> > 
> > There many many available vagrant images:
> > https://app.vagrantup.com/boxes/search There's many vagrant image...because
> > building vagrant image is easy. I built some when RH8 wasn't available yet.
> > So if you need special box, with eg. some predefined setup, you can do it
> > quite fast.  
> 
> My english is poor, I'll try to find other words. My primary and main task
> was to create a prototype for an automatic deploy system. So I used only the
> same technique that will be used on the real hardware servers: RedHat dvd
> image + kickstart. And to test such deploying too. That's why I do not use
> any special image for virtual machines.

How exactly is using a Vagrant box you built yourself different from VirtualBox,
where you clone (I suppose) an existing VM you built?

> > Watchdog is kind of a self-fencing method. Cluster with quorum+watchdog, or
> > SBD+watchdog or quorum+SBD+watchdog are fine...without "active" fencing.  
> 
> quorum+watchdog or SBD+watchdog are useless. Quorum+SBD+watchdog is a
> solution, but also has some drawback, so this is not perfect or fine yet.

Well, by "SBD", I meant "Storage Based Death": using a shared storage to poison
pill other nodes. Not just the sbd daemon, that is used for SBD and watchdog.
Sorry for the shortcut and the confusion.

> I'll write about it below.
>   
> >>> Now, in regard with your multi-site clusters and how you deal with it
> >>> using quorum, did you read the chapter about the Cluster Ticket Registry
> >>> in Pacemaker doc ? See:
> >>> 
> >>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/ch15.html
> >>> 
> >> 
> >> Yep, I read the whole documentation two years ago. Yep, the ticket system
> >> was looked interesting at first glance, but I didn't see a method how to
> >> use it with PAF. :)  
> > 
> > It could be interesting to have detailed feedback about that. Could you
> > share your experience?  
> 
> Heh, I don't have experience of using the ticket system because I can't even
> imaging how to use the ticket system with PAF.

OK

> As about pacemaker without STONITH the idea was simple: quorum + SBD as
> watchdog daemon.

(this is what I described as "quorum+watchdog", again sorry for the
confusion :))

> More precisely described in the README. Proved by my test
> system this is mostly works. :)
> 
> What are possible caveats. First of all softdog is not good for this (only
> for testing), and system will heavily depend on reliability of the watchdog
> device.

+1

> SBD is not good as watchdog daemon. In my version it does not check
> that the corosync and any processes of the pacemaker are not frozen (for
> instance by kill -STOP). Looked like checking for corosync have been already
> done: https://github.com/ClusterLabs/sbd/pull/83

Good.

> Don't know what about checking all processes of the pacemaker.

This moves in the right direction, I would say:

  https://lists.clusterlabs.org/pipermail/users/2020-August/027602.html

The main Pacemaker process is now checked by sbd. Maybe other processes will be
included in future releases as "more in-depth health checks", as written in that
email.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Tuchanka

2020-09-29 Thread Jehan-Guillaume de Rorthais
On Fri, 25 Sep 2020 17:20:28 +0300
Олег Самойлов  wrote:

> Sorry for the late reply. I was on leave and after this some problems at my
> work.
> 
> > On 3 Sep 2020, at 17:23, Jehan-Guillaume de Rorthais 
> > wrote:
> > 
> > Hi,
> > 
> > Thanks for sharing.
> > 
> > I had a very quick glance at your project. I wonder if you were aware of
> > some existing projects/scripts that would have save you a lot of time. Or
> > maybe you know them but they did not fit your needs? Here are some pointers:
> > 
> > # PAF vagrant files
> > 
> >  PAF repository have 3 different Vagrant files able to build 3 different
> > kind of clusters using libvirt.
> >  I'm sure you can use Vagrant with Virtualbox for your needs.
> > 
> >  for a demo:
> >  
> > https://blog.ioguix.net/postgresql/2019/01/24/Build-a-PostreSQL-Automated-Failover-in-5-minutes.html
> >   
> 
> Vagrand was the secondary my attempt after Docker. I didn't not use it,
> because I didn't know that it can be used with libvirt and pure virtual
> machines. I need a pure VM, because I need in my schemas a watchdog device,
> at least the softdog.

Vagrant uses VirtualBox by default, which supports softdog, but it supports many
other virtualization platforms, including e.g. libvirt/KVM where you can use a
virtualized watchdog card.

> Also one of the my tasks was to create a prototype for an automatic
> installation system, which latterly can be converted to Ansible, Salt, Puppet
> or Chef (sysadmins didn't know what to prefer).

Vagrant can use Chef, Ansible, Salt, Puppet, and others to provision VMs:

  https://www.vagrantup.com/docs/provisioning



> So the prototype of the automatic installation system was written on the pure
> bash.

The PAF cluster examples using Vagrant provision VMs with bash scripts, e.g.:

  
https://github.com/ClusterLabs/PAF/tree/master/extra/vagrant/3nodes-vip/provision

> Installation is performed by the standard installation CentOS DVD image
> (or may be other RedHat compatible) and RedHats so called
> "kickstart" (implemented by VirtualBox). But Vagrant need the special
> preinstalled linux image, as far as I can understand, so it can not be used
> for prototyping an automatic installation system for real servers.

There are many, many Vagrant images available: https://app.vagrantup.com/boxes/search
There are many Vagrant images... because building a Vagrant image is easy. I built
some when RH8 wasn't available yet. So if you need a special box, with e.g. some
predefined setup, you can do it quite fast.

> As for the automatic test system, yes, I think it can be rewritten to work
> with libvirt instead of VirtualBox. I don't see reasons why not. "PAF vagrant
> files" doesn't have an automatic test system, there is only possibility for
> manual testing.

Yes, Vagrant is only useful to build and provision a cluster quickly (tip: use
a local package mirror when possible). You need another layer to test it.

> An automatic test system is important to look for low
> probable instability, to check new version of software or to play with setup
> parameters. So I can say that this step I already passed 2 years ago. 
> > # CTS
> > 
> >  Cluster Test Suite is provided with pacemaker to run some pre-defined
> > failure scenarios against any kind of pacemaker-based cluster. I use it for
> > basic tests with PAF and wrote some doc about how to run it from one of the
> > Vagrant environments provided in the PAF repo.
> > 
> >  See:
> >  
> > https://github.com/ClusterLabs/PAF/blob/master/extra/vagrant/README.md#cluster-test-suite
> >   
> 
> Interesting, this looks like an attempt to achieve the same goal, but with a
> different method. Here are the differences:
> 
> CTS uses Vagrant, while I imitate a kickstart automatic installation on real
> servers.

No, CTS doesn't use Vagrant. *I* included some CTS tasks in my Vagrantfile to be
able to quickly run CTS tests.

> CTS is written in Python, I use bash. They concentrate on testing
> the pacemaker functionality, for instance starting/stopping nodes in different
> orders, while I concentrate on tests that imitate hardware failures (for
> instance unlink) or other catastrophic failures (out of space, etc.).

Yes, that's why being able to extend CTS would be useful to add tests.

> They wrote a universal pacemaker test, but my tests are more specific to the
> pacemaker+PAF+PostgreSQL cluster. They use STONITH-based clusters, while I
> use quorum-based clusters to survive a blackout of a whole datacenter.

No, you can use CTS with any kind of cluster/resources setup.

> Using clusters without STONITH is forbidden in the RedHat documentation and
> is not recommended in the ClusterLabs documentation, that's why

Re: [ClusterLabs] Triggering script on cib change

2020-09-16 Thread Jehan-Guillaume de Rorthais
On Wed, 16 Sep 2020 19:57:12 + (UTC)
Strahil Nikolov  wrote:

> Theoretically the CIB is a file on each node, so a script that is looking at
> that file's timestamp or in the cluster's logs should work.

Good one.

This could be simple with a daemon relying on inotify. Or even simpler:
don't write a daemon, use a systemd path setup (relying on inotify as well)
that will do the daemon part and call a script for you:

https://www.freedesktop.org/software/systemd/man/systemd.path.html
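
A minimal sketch of such a setup, assuming the default CIB location and a
hypothetical script and unit name:

  # /etc/systemd/system/cib-watch.path
  [Unit]
  Description=Watch the Pacemaker CIB for changes

  [Path]
  PathModified=/var/lib/pacemaker/cib/cib.xml

  [Install]
  WantedBy=multi-user.target

  # /etc/systemd/system/cib-watch.service (activated by the .path unit above)
  [Unit]
  Description=React to a CIB change

  [Service]
  Type=oneshot
  ExecStart=/usr/local/bin/on-cib-change.sh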

++


Re: [ClusterLabs] Triggering script on cib change

2020-09-16 Thread Jehan-Guillaume de Rorthais
On Wed, 16 Sep 2020 02:20:35 -0400
Digimer  wrote:

> Is there a way to invoke a script when something happens with the
> cluster? Be it a simple transition, stonith action, resource dis/enable
> or recovery, etc?

Not exactly a trigger on all CIB changes, but Alerts are triggered on many of
the events you list here:

  
https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/#idm47160765353680
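
For example, with pcs, wiring an alert agent looks roughly like this (script and
log paths hypothetical; the agent receives the event details through CRM_alert_*
environment variables):

  pcs alert create id=notify_script path=/usr/local/bin/cluster_alert.sh
  pcs alert recipient add notify_script value=/var/log/cluster_alert.log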


Re: [ClusterLabs] Pacemaker/corosync with PostgreSQL 12

2020-09-04 Thread Jehan-Guillaume de Rorthais
On Fri, 4 Sep 2020 10:55:31 +0200
Oyvind Albrigtsen  wrote:

> Add the "recovery.conf" parameters to postgresql.conf (except the
> standby one) and touch standby.signal (which does the same thing).

+1

> After you've verified that it's working and stopped PostgreSQL, you simply
> rm standby.signal and the "recovery.conf"-specific parameters,

Why remove standby.signal and the recovery parameters?
The RA should deal with the first, and the second ones are ignored on a primary
instance.
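
For the record, on PostgreSQL 12 this boils down to something like the following
(connection string hypothetical):

  # postgresql.conf -- former recovery.conf parameters
  primary_conninfo = 'host=10.20.30.40 user=replication application_name=node1'
  recovery_target_timeline = 'latest'   # already the default in v12

  # the standby role itself is signalled by an empty file in the data directory
  touch "$PGDATA/standby.signal"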

> and the resource agent will properly add/remove them when appropriate.

It depends on the agent the OP picked. I suppose this is true with the one
provided by the resource-agents project.

If you are using the PAF resource agent, it only deals with standby.signal.

> On 04/09/20 08:47 +, Ларионов Андрей Валентинович wrote:
> [...] or give a link to existing documentation

This one is available; adapt it to your environment:
https://clusterlabs.github.io/PAF/Quick_Start-CentOS-8.html

Another one should appear soon, using Debian 10/PgSQL 12.

Regards,


Re: [ClusterLabs] Tuchanka

2020-09-03 Thread Jehan-Guillaume de Rorthais
On Thu, 03 Sep 2020 10:58:54 -0500
Ken Gaillot  wrote:
> [...] there are other cluster test platforms already, but none of them really
> cover everybody's desired scenarios (or is easily extensible).

I thought "ra-tester" was, among other things, about extending CTS with custom
tests? Did you attend this talk? Or maybe Damien Ciabrini is around?

Regards,


Re: [ClusterLabs] Tuchanka

2020-09-03 Thread Jehan-Guillaume de Rorthais
Hi,

Thanks for sharing.

I had a very quick glance at your project. I wonder if you were aware of some
existing projects/scripts that would have saved you a lot of time. Or maybe you
know them but they did not fit your needs? Here are some pointers:

# PAF vagrant files

  The PAF repository has 3 different Vagrant files able to build 3 different
  kinds of clusters using libvirt.
  I'm sure you can use Vagrant with VirtualBox for your needs.

  for a demo:
  
https://blog.ioguix.net/postgresql/2019/01/24/Build-a-PostreSQL-Automated-Failover-in-5-minutes.html

# CTS

  Cluster Test Suite is provided with pacemaker to run some pre-defined failure
  scenarios against any kind of pacemaker-based cluster. I use it for basic
  tests with PAF and wrote some doc about how to run it from one of the Vagrant
  environments provided in the PAF repo.

  See:
  
https://github.com/ClusterLabs/PAF/blob/master/extra/vagrant/README.md#cluster-test-suite

# ra-tester:

  Damien Ciabrini from RH gave a talk about ra-tester, which seems to extend
  CTS with custom tests, but I haven't had time to give it a look yet. Slides
  are available here:
  https://wiki.clusterlabs.org/wiki/File:CL2020-slides-Ciabrini-ra-tester.pdf


Now, with regard to your multi-site clusters and how you deal with them using
quorum, did you read the chapter about the Cluster Ticket Registry in the
Pacemaker doc? See:

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/ch15.html
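
The ticket registry is implemented by booth. A minimal sketch of a two-site
setup with an arbitrator, addresses and ticket name being hypothetical:

  # /etc/booth/booth.conf
  transport = UDP
  port = 9929
  site = 192.168.1.10
  site = 192.168.2.10
  arbitrator = 192.168.3.10
  ticket = "ticket-pgsql"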

Regards,


Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-18 Thread Jehan-Guillaume de Rorthais
On Tue, 18 Aug 2020 08:21:50 +0200
Klaus Wenninger  wrote:

> On 8/18/20 7:49 AM, Andrei Borzenkov wrote:
> > 17.08.2020 23:39, Jehan-Guillaume de Rorthais пишет:  
> >> On Mon, 17 Aug 2020 10:19:45 -0500
> >> Ken Gaillot  wrote:
> >>  
> >>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:  
> >>>> Thanks to all your suggestions, I now have the systems with stonith
> >>>> configured on ipmi.
> >>> A word of caution: if the IPMI is on-board -- i.e. it shares the same
> >>> power supply as the computer -- power becomes a single point of
> >>> failure. If the node loses power, the other node can't fence because
> >>> the IPMI is also down, and the cluster can't recover.
> >>>
> >>> Some on-board IPMI controllers can share an Ethernet port with the main
> >>> computer, which would be a similar situation.
> >>>
> >>> It's best to have a backup fencing method when using IPMI as the
> >>> primary fencing method. An example would be an intelligent power switch
> >>> or sbd.  
> >> How would SBD be useful in this scenario? The poison pill will not be
> >> swallowed by the dead node... Is it just to wait for the watchdog timeout?
> >>  
> > The node is expected to commit suicide if SBD loses access to the shared
> > block device. So either the node swallowed the poison pill and died, or it
> > died because it realized it was impossible to see the poison pill, or it was
> > dead already. After the watchdog timeout (twice the watchdog timeout for
> > safety) we assume the node is dead.
> Yes, like this a suicide via watchdog will be triggered if there are
> issues with the disk. This is why it is important to have a reliable
> watchdog with SBD even when using poison pill. As this alone would
> make a single shared disk a SPOF, running with pacemaker integration
> (default) a node with SBD will survive despite losing the disk
> when it has quorum and pacemaker looks healthy. As corosync-quorum
> in 2-node mode obviously won't be fit for this purpose, SBD will switch
> to checking for the presence of both nodes if the 2-node flag is set.
> 
> Sorry for the lengthy explanation, but the full picture is required
> to understand why it is sufficiently reliable and useful if configured

Thank you Andrei and Klaus for the explanation.
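
For the record, layering IPMI with a backup fencing method can be expressed with
fencing levels. A minimal sketch with pcs, node and device names being
hypothetical:

  pcs stonith level add 1 node1 fence_ipmi_node1  # try IPMI first
  pcs stonith level add 2 node1 fence_pdu_node1   # fall back to the power switch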

Regards,

