Split-brain can happen if things are misconfigured.
Keep in mind that slurmdbd is merely a daemon that talks to a database.
That database should be HA and separate from the slurmdbd systems.
At our site, we have a central DB server that both slurmdbd systems
point to. In this scenario, it is the DB that ensures both are getting
accurate and current information.
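
A rough sketch of that layout, not a copy of any production config: the
hostnames dbd-a, dbd-b and db.example.org, the database name and the
database user are placeholders assumed for illustration, and StoragePass
is left out on purpose. The point is only that both slurmdbd hosts carry
the same slurmdbd.conf and the same StorageHost, so there is a single
database behind them.

```
# slurmdbd.conf (same file on both slurmdbd hosts; hostnames are placeholders)
DbdHost=dbd-a
DbdBackupHost=dbd-b
StorageType=accounting_storage/mysql
StorageHost=db.example.org      # the one central (HA) database server
StorageLoc=slurm_acct_db
StorageUser=slurm

# slurm.conf (on the controller; tells slurmctld where both daemons live)
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=dbd-a
AccountingStorageBackupHost=dbd-b
```

With this arrangement, high availability of the data itself is handled at
the database layer rather than by the slurmdbd pair.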
Brian Andrus
On 6/27/2022 9:15 AM, taleinterve...@sjtu.edu.cn wrote:
Hi, all:
We noticed that slurmdbd provides the configuration option *DbdBackupHost*
for users to set up a secondary slurmdbd node. Since slurmdbd is closely
tied to the database, we wonder whether running multiple slurmdbd instances
introduces the split-brain danger that is a common topic in database
high-availability discussions. Is there any case in which slurmdbd_A
and slurmdbd_B fail to recognize each other’s state and both act as
the active node?
Another related question: when the primary slurmdbd node is working well,
will the standby slurmdbd node write anything to the database? If the standby
slurmdbd does not write anything, is it safe to connect
slurmdbd_A to mysql_A and slurmdbd_B to mysql_B separately, and use
multi-source replication to sync mysql_A with mysql_B?