On Mon, Jul 4, 2022 at 1:06 AM Reid Wahl <nw...@redhat.com> wrote:
>
> On Sat, Jul 2, 2022 at 1:12 PM vitaly <vit...@unitc.com> wrote:
> >
> > Sorry, I noticed that I am missing meta "notice=true" and after adding it
> > to postgres-ms configuration "notice" events started to come through.
> > Item 1 still needs explanation. As pacemaker-controld keeps complaining.
>
> What happens when you run `OCF_ROOT=/usr/lib/ocf /usr/lib/ocf/resource.d/heartbeat/pgsql-rhino meta-data`?
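
For item 1, the manual check suggested above can be wrapped in a small function. This is only a sketch: `check_agent_metadata` is a hypothetical helper name, not part of Pacemaker or resource-agents, and it assumes that a healthy agent exits 0 from `meta-data` and prints XML containing a `<resource-agent` element, as the OCF resource agent convention describes.

```shell
# Hypothetical helper (not shipped with Pacemaker): run an agent's
# meta-data action by hand and report whether the result looks sane.
# Arguments: the command that invokes the agent, e.g. its full path.
check_agent_metadata() {
    # OCF_ROOT is passed through because agents usually source their
    # shell function library from under it; the default is an assumption.
    output=$(OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}" "$@" meta-data 2>/dev/null)
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "meta-data exited with status $rc" >&2
        return 1
    fi
    case "$output" in
        *"<resource-agent"*)
            return 0
            ;;
        *)
            echo "meta-data printed no <resource-agent element" >&2
            return 1
            ;;
    esac
}
```

Called as `check_agent_metadata /usr/lib/ocf/resource.d/heartbeat/pgsql-rhino`, a non-zero return would suggest that the agent's own meta-data action is the first place to look.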

This may also be relevant:
https://lists.clusterlabs.org/pipermail/users/2022-June/030391.html

>
> > Thanks!
> > _Vitaly
> >
> > > On 07/02/2022 2:04 PM vitaly <vit...@unitc.com> wrote:
> > >
> > >
> > > Hello Everybody.
> > > I have a 2 node cluster with clone resource “postgres-ms”. We are running
> > > following versions of pacemaker/corosync:
> > > d19-25-left.lab.archivas.com ~ # rpm -qa | grep "pacemaker\|corosync"
> > > pacemaker-cluster-libs-2.0.5-9.el8.x86_64
> > > pacemaker-libs-2.0.5-9.el8.x86_64
> > > pacemaker-cli-2.0.5-9.el8.x86_64
> > > corosynclib-3.1.0-5.el8.x86_64
> > > pacemaker-schemas-2.0.5-9.el8.noarch
> > > corosync-3.1.0-5.el8.x86_64
> > > pacemaker-2.0.5-9.el8.x86_64
> > >
> > > There are couple of issues that could be related.
> > > 1. There are following messages in the logs coming from
> > > pacemaker-controld:
> > > Jul 2 14:59:27 d19-25-right pacemaker-controld[1489734]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > > Jul 2 14:59:27 d19-25-right pacemaker-controld[1489734]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > >
> > > 2. ocf:heartbeat:pgsql-rhino does not get any "notice" operations which
> > > causes multiple issues with postgres synchronization during availability
> > > events.
> > >
> > > 3. Item 2 raises another question.
> > > Who is setting these values:
> > > ${OCF_RESKEY_CRM_meta_notify_type}
> > > ${OCF_RESKEY_CRM_meta_notify_operation}
> > >
> > > Here is excerpt from cluster config:
> > >
> > > d19-25-left.lab.archivas.com ~ # pcs config
> > >
> > > Cluster Name:
> > > Corosync Nodes:
> > >  d19-25-right.lab.archivas.com d19-25-left.lab.archivas.com
> > > Pacemaker Nodes:
> > >  d19-25-left.lab.archivas.com d19-25-right.lab.archivas.com
> > >
> > > Resources:
> > >  Clone: postgres-ms
> > >   Meta Attrs: promotable=true target-role=started
> > >   Resource: postgres (class=ocf provider=heartbeat type=pgsql-rhino)
> > >    Attributes: master_ip=172.16.1.6
> > >     node_list="d19-25-left.lab.archivas.com d19-25-right.lab.archivas.com"
> > >     pgdata=/pg_data remote_wals_dir=/remote/walarchive rep_mode=sync
> > >     reppassword=XXXXXX repuser=XXXXXXX
> > >     restore_command="/opt/rhino/sil/bin/script_wrapper.sh wal_restore.py %f %p"
> > >     tmpdir=/pg_data/tmp wals_dir=/pg_data/pg_wal
> > >     xlogs_dir=/pg_data/pg_xlog
> > >    Meta Attrs: is-managed=true
> > >    Operations: demote interval=0 on-fail=restart timeout=120s (postgres-demote-interval-0)
> > >                methods interval=0s timeout=5 (postgres-methods-interval-0s)
> > >                monitor interval=10s on-fail=restart timeout=300s (postgres-monitor-interval-10s)
> > >                monitor interval=5s on-fail=restart role=Master timeout=300s (postgres-monitor-interval-5s)
> > >                notify interval=0 on-fail=restart timeout=90s (postgres-notify-interval-0)
> > >                promote interval=0 on-fail=restart timeout=120s (postgres-promote-interval-0)
> > >                start interval=0 on-fail=restart timeout=1800s (postgres-start-interval-0)
> > >                stop interval=0 on-fail=fence timeout=120s (postgres-stop-interval-0)
> > > Thank you very much!
> > > _Vitaly

--
Regards,

Reid Wahl (He/Him), RHCA
Senior Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
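
On item 3 of the original question: Pacemaker itself exports `OCF_RESKEY_CRM_meta_notify_type` and `OCF_RESKEY_CRM_meta_notify_operation` into the agent's environment when it calls the `notify` action on a clone whose `notify` meta-attribute is true (presumably what "notice=true" in the thread refers to). A minimal sketch of how an agent might read them follows; `describe_notification` and its messages are hypothetical and are not taken from pgsql-rhino.

```shell
# Hypothetical notify handler: reports which notification Pacemaker sent.
# The two environment variables are set by Pacemaker, not by the agent.
describe_notification() {
    n_type="${OCF_RESKEY_CRM_meta_notify_type:-}"     # "pre" or "post"
    n_op="${OCF_RESKEY_CRM_meta_notify_operation:-}"  # e.g. "promote"
    if [ -z "$n_type" ] || [ -z "$n_op" ]; then
        # Seen when the action is run by hand, outside a notify call.
        echo "notify variables are not set" >&2
        return 1
    fi
    case "${n_type}-${n_op}" in
        pre-promote)  echo "an instance is about to be promoted" ;;
        post-promote) echo "an instance has been promoted" ;;
        pre-demote)   echo "an instance is about to be demoted" ;;
        post-demote)  echo "an instance has been demoted" ;;
        *)            echo "notification: ${n_type}-${n_op}" ;;
    esac
}
```

Running the function by hand with both variables exported is one way to exercise an agent's notify branches without waiting for a real cluster event.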