On 18. 05. 21 at 14:55, [email protected] wrote:


Sent: Tuesday, 18 May 2021 at 14:49
From: [email protected]
To: [email protected]
Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?

Hi Andrei, hi everybody,

...
and it works great. Thanks for the hint.
But the thing I still don't understand is why the cluster demotes its active
node for a short time when I re-enable a node from standby back to unstandby.
Is it not possible to join the drbd as secondary without demoting the primary
for a short moment?

Try adding interleave=true to your clones.

I tried this, but it gets me an error message. What is wrong?

  pcs resource update database_drbd ocf:linbit:drbd drbd_resource=drbd1 
promotable promoted-max=1 promoted-node-max=1 clone-max=2 clone-node-max=1 
notify=true interleave=true

Error: invalid resource options: 'clone-max', 'clone-node-max', 'interleave', 
'notify', 'promoted-max', 'promoted-node-max', allowed options are: 
'adjust_master_score', 'connect_only_after_promote', 'drbd_resource', 
'drbdconf', 'fail_promote_early_if_peer_primary', 
'ignore_missing_notifications', 'remove_master_score_if_peer_primary', 
'require_drbd_module_version_ge', 'require_drbd_module_version_lt', 
'stop_outdates_secondary', 'unfence_extra_args', 'unfence_if_all_uptodate', 
'wfc_timeout', use --force to override

or is it simply:
pcs resource update database_drbd-clone interleave=true ?

Hi fatcharly,

The error comes from the fact that the update command, as you used it, is trying to update instance attributes (that is, options which pacemaker passes to the resource agent). The agent doesn't define the options you named, therefore pcs prints an error.

You want to update meta attributes, which are options pacemaker processes itself. This is how you do it:

pcs resource meta database_drbd-clone interleave=true
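
The same presumably applies to your other clone, since the suggestion was to
add interleave=true to all your clones:

pcs resource meta drbd_logsfiles-clone interleave=true

Afterwards "pcs resource config database_drbd-clone" should show
interleave=true among the meta attributes.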


Regards,
Tomas



Any suggestions are welcome

Stay safe and take care

fatcharly




Sent: Wednesday, 12 May 2021 at 19:04
From: "Andrei Borzenkov" <[email protected]>
To: [email protected]
Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?

On 12.05.2021 17:34, [email protected] wrote:
Hi Andrei, hi everybody,


Sent: Wednesday, 12 May 2021 at 16:01
From: [email protected]
To: [email protected]
Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?

Hi Andrei, hi everybody,


You need to order fs_database after the promote operation; and, as I just
found, pacemaker also does not reverse it correctly and executes fs stop and
drbd demote concurrently. So you need an additional order constraint to
first stop the fs, then demote drbd.
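
In pcs terms that presumably maps to something like:

pcs constraint order promote database_drbd-clone then start fs_database
pcs constraint order stop fs_database then demote database_drbd-clone

plus the analogous pair for drbd_logsfiles-clone and fs_logfiles.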

Is there good documentation about this? I don't know how to achieve an
"order after promote" constraint, and how can I tell pcs to first unmount the
filesystem mountpoint and then demote the drbd-device?

OK, so I found something and used this:

pcs constraint order stop fs_logfiles then demote drbd_logsfiles-clone
pcs constraint order stop fs_database then demote database_drbd-clone

and it works great. Thanks for the hint.
But the thing I still don't understand is why the cluster demotes its active
node for a short time when I re-enable a node from standby back to unstandby.
Is it not possible to join the drbd as secondary without demoting the primary
for a short moment?

Try adding interleave=true to your clones.


Best regards and take care

fatcharly



Sorry, but this is new for me.

Best regards and take care

fatcharly




Sent: Tuesday, 11 May 2021 at 17:19
From: "Andrei Borzenkov" <[email protected]>
To: [email protected]
Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?

On 11.05.2021 17:43, [email protected] wrote:
Hi,

I'm using CentOS 8.3.2011 with pacemaker-2.0.4-6.el8_3.1.x86_64,
corosync-3.0.3-4.el8.x86_64 and kmod-drbd90-9.0.25-2.el8_3.elrepo.x86_64.
The cluster consists of two nodes which provide an HA MariaDB with the
help of two drbd devices for the database and the logfiles. Corosync is
running over two rings, and both machines are virtual KVM guests.

Problem:
Node susanne is the active node and lisbon is coming back from standby;
susanne is trying to demote one drbd-device but is failing to. The cluster
keeps working properly, but the error stays.
This is what happens:

Cluster Summary:
   * Stack: corosync
   * Current DC: lisbon (version 2.0.4-6.el8_3.1-2deceaa3ae) - partition with quorum
   * Last updated: Tue May 11 16:15:54 2021
   * Last change:  Tue May 11 16:15:42 2021 by root via cibadmin on susanne
   * 2 nodes configured
   * 11 resource instances configured

Node List:
   * Online: [ lisbon susanne ]

Active Resources:
   * HA_IP       (ocf::heartbeat:IPaddr2):        Started susanne
   * Clone Set: database_drbd-clone [database_drbd] (promotable):
     * Masters: [ susanne ]
     * Slaves: [ lisbon ]
   * Clone Set: drbd_logsfiles-clone [drbd_logsfiles] (promotable):
     * drbd_logsfiles    (ocf::linbit:drbd):      Demoting susanne
   * fs_logfiles (ocf::heartbeat:Filesystem):     Started susanne

Presumably fs_logfiles is located on drbd_logsfiles, so how come it is
active while drbd_logsfiles is being demoted? Then drbdadm fails to
change the status to secondary and the RA simply loops forever until timeout.
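
For illustration, assuming this is what happens underneath: while fs_logfiles
still has /dev/drbd2 mounted, a manual demote on susanne would look roughly
like

drbdadm secondary drbd2
# fails with something like: State change failed: Device is held open by someone

so the demote cannot succeed until the filesystem is stopped first.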

   * fs_database (ocf::heartbeat:Filesystem):     Started susanne
   * mysql-server        (ocf::heartbeat:mysql):  Started susanne
   * Clone Set: ping_fw-clone [ping_fw]:
     * Started: [ lisbon susanne ]

-------------------------------------------------------------------------------------------
after a few seconds it switches over:

Cluster Summary:
   * Stack: corosync
   * Current DC: lisbon (version 2.0.4-6.el8_3.1-2deceaa3ae) - partition with quorum
   * Last updated: Tue May 11 16:17:59 2021
   * Last change:  Tue May 11 16:15:42 2021 by root via cibadmin on susanne
   * 2 nodes configured
   * 11 resource instances configured

Node List:
   * Online: [ lisbon susanne ]

Active Resources:
   * HA_IP       (ocf::heartbeat:IPaddr2):        Started susanne
   * Clone Set: database_drbd-clone [database_drbd] (promotable):
     * Masters: [ susanne ]
     * Slaves: [ lisbon ]
   * Clone Set: drbd_logsfiles-clone [drbd_logsfiles] (promotable):
     * Masters: [ susanne ]
     * Slaves: [ lisbon ]
   * fs_logfiles (ocf::heartbeat:Filesystem):     Started susanne
   * fs_database (ocf::heartbeat:Filesystem):     Started susanne
   * mysql-server        (ocf::heartbeat:mysql):  Started susanne
   * Resource Group: apache:
     * httpd_srv (ocf::heartbeat:apache):         Started susanne
   * Clone Set: ping_fw-clone [ping_fw]:
     * Started: [ lisbon susanne ]

Failed Resource Actions:
   * drbd_logsfiles_demote_0 on susanne 'error' (1): call=736, status='Timed Out',
exitreason='', last-rc-change='2021-05-11 16:15:42 +02:00', queued=0ms,
exec=90001ms
----------------------------------------------------------------------------------------------


And what do you see in the logs?
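
On CentOS 8 the pacemaker log is typically /var/log/pacemaker/pacemaker.log;
something like

grep -E 'drbd_logsfiles|demote' /var/log/pacemaker/pacemaker.log

on susanne should show the failed demote attempts.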

I think it is a constraint problem, but I can't find it.
This is my config:
[root@susanne pacemaker]# pcs config show
Cluster Name: mysql_cluster
Corosync Nodes:
  susanne lisbon
Pacemaker Nodes:
  lisbon susanne

Resources:
  Resource: HA_IP (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=24 ip=192.168.18.154
   Operations: monitor interval=15s (HA_IP-monitor-interval-15s)
               start interval=0s timeout=20s (HA_IP-start-interval-0s)
               stop interval=0s timeout=20s (HA_IP-stop-interval-0s)
  Clone: database_drbd-clone
   Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true 
promoted-max=1 promoted-node-max=1
   Resource: database_drbd (class=ocf provider=linbit type=drbd)
    Attributes: drbd_resource=drbd1
    Operations: demote interval=0s timeout=90 (database_drbd-demote-interval-0s)
                monitor interval=20 role=Slave timeout=20 
(database_drbd-monitor-interval-20)
                monitor interval=10 role=Master timeout=20 
(database_drbd-monitor-interval-10)
                notify interval=0s timeout=90 (database_drbd-notify-interval-0s)
                promote interval=0s timeout=90 
(database_drbd-promote-interval-0s)
                reload interval=0s timeout=30 (database_drbd-reload-interval-0s)
                start interval=0s timeout=240 (database_drbd-start-interval-0s)
                stop interval=0s timeout=100 (database_drbd-stop-interval-0s)
  Clone: drbd_logsfiles-clone
   Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true 
promoted-max=1 promoted-node-max=1
   Resource: drbd_logsfiles (class=ocf provider=linbit type=drbd)
    Attributes: drbd_resource=drbd2
    Operations: demote interval=0s timeout=90 
(drbd_logsfiles-demote-interval-0s)
                monitor interval=20 role=Slave timeout=20 
(drbd_logsfiles-monitor-interval-20)
                monitor interval=10 role=Master timeout=20 
(drbd_logsfiles-monitor-interval-10)
                notify interval=0s timeout=90 
(drbd_logsfiles-notify-interval-0s)
                promote interval=0s timeout=90 
(drbd_logsfiles-promote-interval-0s)
                reload interval=0s timeout=30 
(drbd_logsfiles-reload-interval-0s)
                start interval=0s timeout=240 (drbd_logsfiles-start-interval-0s)
                stop interval=0s timeout=100 (drbd_logsfiles-stop-interval-0s)
  Resource: fs_logfiles (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/drbd2 directory=/mnt/clusterfs2 fstype=ext4
   Operations: monitor interval=20s timeout=40s 
(fs_logfiles-monitor-interval-20s)
               start interval=0s timeout=60s (fs_logfiles-start-interval-0s)
               stop interval=0s timeout=60s (fs_logfiles-stop-interval-0s)
  Resource: fs_database (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/drbd1 directory=/mnt/clusterfs1 fstype=ext4
   Operations: monitor interval=20s timeout=40s 
(fs_database-monitor-interval-20s)
               start interval=0s timeout=60s (fs_database-start-interval-0s)
               stop interval=0s timeout=60s (fs_database-stop-interval-0s)
  Resource: mysql-server (class=ocf provider=heartbeat type=mysql)
   Attributes: additional_parameters=--bind-address=0.0.0.0 
binary=/usr/bin/mysqld_safe config=/etc/my.cnf datadir=/mnt/clusterfs1/mysql 
pid=/var/lib/mysql/run/mariadb.pid socket=/var/lib/mysql/mysql.sock
   Operations: demote interval=0s timeout=120s (mysql-server-demote-interval-0s)
               monitor interval=20s timeout=30s 
(mysql-server-monitor-interval-20s)
               notify interval=0s timeout=90s (mysql-server-notify-interval-0s)
               promote interval=0s timeout=120s 
(mysql-server-promote-interval-0s)
               start interval=0s timeout=60s (mysql-server-start-interval-0s)
               stop interval=0s timeout=60s (mysql-server-stop-interval-0s)
  Group: apache
   Resource: httpd_srv (class=ocf provider=heartbeat type=apache)
    Attributes: configfile=/etc/httpd/conf/httpd.conf 
statusurl=http://127.0.0.1/server-status
    Operations: monitor interval=10s timeout=20s 
(httpd_srv-monitor-interval-10s)
                start interval=0s timeout=40s (httpd_srv-start-interval-0s)
                stop interval=0s timeout=60s (httpd_srv-stop-interval-0s)
  Clone: ping_fw-clone
   Resource: ping_fw (class=ocf provider=pacemaker type=ping)
    Attributes: dampen=10s host_list=192.168.18.1 multiplier=1000
    Operations: monitor interval=10s timeout=60s (ping_fw-monitor-interval-10s)
                start interval=0s timeout=60s (ping_fw-start-interval-0s)
                stop interval=0s timeout=20s (ping_fw-stop-interval-0s)

Stonith Devices:
Fencing Levels:

Location Constraints:
   Resource: mysql-server
     Constraint: location-mysql-server
       Rule: boolean-op=or score=-INFINITY (id:location-mysql-server-rule)
         Expression: pingd lt 1 (id:location-mysql-server-rule-expr)
         Expression: not_defined pingd (id:location-mysql-server-rule-expr-1)
Ordering Constraints:
   start mysql-server then start httpd_srv (kind:Mandatory) 
(id:order-mysql-server-httpd_srv-mandatory)
   start database_drbd-clone then start drbd_logsfiles-clone (kind:Mandatory) 
(id:order-database_drbd-clone-drbd_logsfiles-clone-mandatory)
   start drbd_logsfiles-clone then start fs_database (kind:Mandatory) 
(id:order-drbd_logsfiles-clone-fs_database-mandatory)

You need to order fs_database after the promote operation; and, as I just
found, pacemaker also does not reverse it correctly and executes fs stop and
drbd demote concurrently. So you need an additional order constraint to
first stop the fs, then demote drbd.

   start fs_database then start fs_logfiles (kind:Mandatory) 
(id:order-fs_database-fs_logfiles-mandatory)
   start fs_logfiles then start mysql-server (kind:Mandatory) 
(id:order-fs_logfiles-mysql-server-mandatory)
Colocation Constraints:
   fs_logfiles with drbd_logsfiles-clone (score:INFINITY) 
(with-rsc-role:Master) (id:colocation-fs_logfiles-drbd_logsfiles-clone-INFINITY)
   fs_database with database_drbd-clone (score:INFINITY) (with-rsc-role:Master) 
(id:colocation-fs_database-database_drbd-clone-INFINITY)
   drbd_logsfiles-clone with database_drbd-clone (score:INFINITY) 
(rsc-role:Master) (with-rsc-role:Master) 
(id:colocation-drbd_logsfiles-clone-database_drbd-clone-INFINITY)
   HA_IP with database_drbd-clone (score:INFINITY) (rsc-role:Started) 
(with-rsc-role:Master) (id:colocation-HA_IP-database_drbd-clone-INFINITY)
   mysql-server with fs_database (score:INFINITY) 
(id:colocation-mysql-server-fs_database-INFINITY)
   httpd_srv with mysql-server (score:INFINITY) 
(id:colocation-httpd_srv-mysql-server-INFINITY)
Ticket Constraints:

Alerts:
  No alerts defined

Resources Defaults:
   No defaults set
Operations Defaults:
   No defaults set

Cluster Properties:
  cluster-infrastructure: corosync
  cluster-name: mysql_cluster
  dc-version: 2.0.4-6.el8_3.1-2deceaa3ae
  have-watchdog: false
  last-lrm-refresh: 1620742514
  stonith-enabled: FALSE

Tags:
  No tags defined

Quorum:
   Options:





Any suggestions are welcome

Best regards, stay safe, take care

fatcharly

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/