Hello,

I'm having trouble with a MariaDB cluster (2 nodes, master/slave) on Debian 11, and I am running out of ideas.

Environment:

Node1:
  OS: Debian 11
  Kernel: 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21)
  Versions: resource-agents (4.7.0-1), pacemaker (2.0.5-2), corosync (3.1.2-2), mariadb (10.5.18-0+deb11u1)

Node2:
  OS: Debian 11
  Kernel: 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21)
  Versions: resource-agents (4.7.0-1), pacemaker (2.0.5-2), corosync (3.1.2-2), mariadb (10.5.18-0+deb11u1)

The output of crm configure show is attached at the end of this message.

Problem:

When I restart Node2 (the slave), it comes back up correctly in the cluster:

$ crm status
Cluster Summary:
  * Stack: corosync
  * Current DC: Node1 (version 2.0.5-ba59be7122) - partition with quorum
  * Last updated: Thu Jan 26 12:04:57 2023
  * Last change:  Thu Jan 26 11:39:58 2023 by root via cibadmin on Node2
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ Node1 Node2 ]

Full List of Resources:
  * VIP (ocf::heartbeat:IPaddr2):        Started Node1
  * Clone Set: MYSQLREPLICATOR [MYSQL] (promotable):
    * Masters: [ Node1 ]
    * Slaves: [ Node2 ]

However, it does not restore its replication configuration: SHOW SLAVE STATUS on Node2 returns an empty result.
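
For reference, the check on Node2 is essentially the following (exact client options aside); it prints an empty result:

$ mysql -e "SHOW SLAVE STATUS\G"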

In the Node2 logs, I can see the following messages, which explain why replication is not running:

Jan 25 16:29:38  mysql(MYSQL)[22862]:    INFO: No MySQL master present - clearing replication state
Jan 25 16:29:39  mysql(MYSQL)[22862]:    WARNING: MySQL Slave IO threads currently not running.
Jan 25 16:29:39  mysql(MYSQL)[22862]:    ERROR: MySQL Slave SQL threads currently not running.
Jan 25 16:29:39  mysql(MYSQL)[22862]:    ERROR: See  for details
Jan 25 16:29:39  mysql(MYSQL)[22862]:    ERROR: ERROR 1200 (HY000) at line 1: Misconfigured slave: MASTER_HOST was not set; Fix in config file or with CHANGE MASTER TO

From what I see in the following file, Node2 does not seem to find the master's name, so it clears its replication state:

/usr/lib/ocf/resource.d/heartbeat/mysql

        master_host=`echo $OCF_RESKEY_CRM_meta_notify_master_uname|tr -d " "`
        if [ "$master_host" -a "$master_host" != ${NODENAME} ]; then
            ocf_log info "Changing MySQL configuration to replicate from 
$master_host."
            set_master
            start_slave
            if [ $? -ne 0 ]; then
                ocf_exit_reason "Failed to start slave"
                return $OCF_ERR_GENERIC
            fi
        else
            ocf_log info "No MySQL master present - clearing replication state"
            unset_master
        fi
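
One thing I plan to try for debugging is to temporarily add a log line just before that test (a rough sketch, to be removed afterwards), so I can see exactly which notify variables the agent receives on Node2:

        ocf_log info "DEBUG notify: type=${OCF_RESKEY_CRM_meta_notify_type} op=${OCF_RESKEY_CRM_meta_notify_operation} master_uname='${OCF_RESKEY_CRM_meta_notify_master_uname}'"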

Since this is a production environment, I performed a bare-metal restore of these machines onto two lab machines, and there I cannot reproduce the problem...
In production there is a lot of write activity, but the servers are far from saturated.
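
For completeness, my understanding is that the manual equivalent of what the agent's set_master/start_slave path should run on Node2 is roughly the following (host, binlog file and position taken from the MYSQL_REPL_INFO property in the attached configuration, replication user from the resource parameters, password redacted):

mysql <<'EOF'
CHANGE MASTER TO
  MASTER_HOST='Node1',
  MASTER_USER='repl_user',
  MASTER_PASSWORD='XXXX',
  MASTER_LOG_FILE='mysql-bin.001812',
  MASTER_LOG_POS=358;
START SLAVE;
EOF

But of course I would like the cluster to handle this by itself.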

Thank you in advance for all the help you can give me.

Best regards,

Thomas Cas  |  Managed Services Support Technician
PHONE: +33 3 51 25 23 26  |  WEB: https://www.ikoula.com/en
IKOULA Data Center 34 rue Pont Assy - 51100 Reims - FRANCE


Attachment: crm configure show

node 167772802: Node1
node 167772803: Node2
primitive MYSQL mysql \
        params binary="/usr/sbin/mysqld" user=mysql log="/var/log/mysql.log" \
               config="/etc/mysql/my.cnf" datadir="/var/lib/mysql" \
               pid="/var/run/mysqld/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" \
               replication_user=repl_user replication_passwd=XXXX \
               test_user=pacemaker test_passwd=XXXX \
               additional_parameters="--bind-address=0.0.0.0" \
        meta migration-threshold=1 \
        meta target-role=Started \
        op start timeout=120 interval=0 \
        op stop timeout=120 interval=0 \
        op monitor role=Master interval=10s timeout=30s \
        op monitor role=Slave interval=20s timeout=30s 
primitive VIP IPaddr2 \
        params ip=<IP> cidr_netmask=16 nic=eth0 \
        op monitor interval=60s \
        meta target-role=Started
ms MYSQLREPLICATOR MYSQL \
        meta target-role=Started master-max=1 master-node-max=1 clone-max=2 \
             clone-node-max=1 notify=true interleave=true is-managed=true
order MYSQLREPLICATOR_promote_before_VIP Mandatory: MYSQLREPLICATOR:promote VIP:start
colocation VIP_ON_MASTER inf: VIP MYSQLREPLICATOR:Master
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=2.0.5-ba59be7122 \
        cluster-infrastructure=corosync \
        cluster-name=debian \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        last-lrm-refresh=1674723115 \
        maintenance-mode=false
property mysql_replication: \
        MYSQL_REPL_INFO="Node1|mysql-bin.001812|358"
rsc_defaults rsc-options: \
        resource-stickiness=INFINITY