We do the same as Josef - we run the database on a VM (single VM,
MariaDB) and leave it up to (in our case) VMware to ensure its availability.
Tina
On 25/01/2024 11:34, Josef Dvoracek wrote:
To protect against HW failure, and to have a freer hand when upgrading the
underlying OS, we use virtualization with "live migration"/HA and run the
MariaDB server as a VM.
A VM is easy to back up, restore from a snapshot, clone for tests, etc.
In the past, I deployed (customer-requirement) one site u
Sent: 22 January 2024 17:23
To: Slurm User Community List
Subject: [slurm-users] Database cluster
Hi Diego.
In our setup, the database is critical. We have some wrapper scripts that
consult the database for information, and we also set environment variables on
login, based on user/partition associations. If the database is down, none of
those things work.
I doubt there is appetite in the
IIUC the database is not "critical": if it goes down, you lose access to
some statistics. But job data gets cached anyway and the db will be
updated when it comes back online.
Diego
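The two views above differ mainly in what sits on top of the database. A wrapper script of the kind described earlier could check that slurmdbd answers before querying it, and fall back to a default when it does not. A minimal Python sketch, assuming the default slurmdbd port 6819; the function names and the fallback value are hypothetical and not taken from anyone's site:

```python
import socket


def dbd_reachable(host: str, port: int = 6819, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def lookup_with_fallback(host, query, fallback, port=6819):
    """Call query() only if the daemon answers; otherwise return fallback.

    query is any zero-argument callable supplied by the wrapper script
    (hypothetical, e.g. something that shells out to sacctmgr).
    """
    if dbd_reachable(host, port):
        return query()
    return fallback
```

This only tests that the port accepts connections, not that the database behind slurmdbd is healthy, so it is a cheap guard rather than a real health check.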
On 22/01/2024 18:23, Daniel L'Hommedieu wrote:
Community:
What do you do to ensure database reliability in your SLURM environment? We
can have multiple controllers and multiple slurmdbds, but my understanding is
that slurmdbd can be configured with a single MySQL server, so what do you do?
Do you have that “single MySQL server” be a clust