On Tue, 3 Feb 2026 at 16:12, Marko Mäkelä <[email protected]> wrote:
>
> There are also file systems such as XFS that do not support snapshots
> but do support block cloning, a.k.a. copy-on-write files. Making use
> of that, we could rather quickly copy the entire data directory to a
> new backup directory. On a file system that lacks block cloning and
> snapshots, we must copy files, but the copying can make use of Linux
> system calls.
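For what it's worth, here is a minimal Python sketch of one way those
system calls fit together on Linux. The FICLONE ioctl number is from
<linux/fs.h> (the fcntl module does not export it); the function name
and the fallback logic are my own assumptions about the intended
approach, not anything mariabackup does today:

```python
import fcntl
import os

# FICLONE ioctl request number from <linux/fs.h>.
FICLONE = 0x40049409

def clone_file(src_path, dst_path):
    """Clone src to dst. On filesystems with reflink support (XFS
    formatted with reflink=1, btrfs), FICLONE shares the underlying
    blocks copy-on-write, so "copying" a tablespace is nearly instant.
    Otherwise fall back to copy_file_range(), which at least keeps the
    byte copy inside the kernel."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
        except OSError:
            # No reflink support (e.g. ext4): kernel-side byte copy.
            remaining = os.fstat(src.fileno()).st_size
            while remaining > 0:
                n = os.copy_file_range(src.fileno(), dst.fileno(),
                                       remaining)
                if n == 0:
                    break
                remaining -= n
```

The FICLONE path is essentially what `cp --reflink=auto` does per file.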
Unless I am missing something, this would not solve the "moving
target" problem that snapshots specifically solve. But I guess if you
could quickly block-clone everything and mariabackup were aware of it,
that would minimize the backup window during which the redo log is at
risk of overflowing.

> By the "ZFS backup target", are you referring to the "zfs send"
> command?

Yes, specifically *incremental* zfs send.

> I also found some documentation about "btrfs send" and the

It has been a long time since I looked at btrfs, but I seem to vaguely
recall that its incrementals still involve reading the entire old and
new files to compute the delta, which is very inefficient,
particularly with databases, where updating a single row means having
to re-read the entire tablespace. ZFS is significantly more advanced
than that and only has to read and send the blocks that have actually
changed.

> Microsoft Windows "refsutil streamsnapshot". These would stream a
> file system snapshot to another system.

I'm sure the 0.1% of MariaDB deployments on Windows could benefit from
that. :-)

> I have also been thinking of implementing a live streaming backup in
> the tar format. Perhaps, for performance reasons, there should be an
> option to create multiple streams in parallel. I am yet to experiment
> with this.

I don't think tar can do that, which is why there is no such thing as
a parallel tar. And tar can actually be a serious single-threaded
bottleneck when you are using NVMe drives and 10G+ networking. The
only real tunable only shifts it by about 33% (on x86-64; other
platforms may differ):

https://shatteredsilicon.net/tuning-tar/

And 33% doesn't really move the needle enough for large, fast servers
that run databases tens of terabytes in size.

> > For ZFS servers with object storage targets, you don't get
> > hyper-efficient incrementals, but it is usually reasonably workable,
> > as long as you have enough disk I/O and network bandwidth.
> >
> > We use this tool that we developed in-house, as a counterpart to
> > sanoid:
> >
> > https://github.com/shatteredsilicon/backoid
>
> How would you use this tool with ZFS? Could it also be used with
> btrfs, or to copy any set of files in any file system?

It is specifically ZFS-aware. The config is similar to (and heavily
inspired by) sanoid's: you specify the ZFS dataset and how many
snapshots of what granularity you want to keep, e.g.:

```
[zpoolname/mysql]
use_template = production

[template_production]
pattern = ^autosnap_.*
compression = zstd
compression_level = 3
target = s3:production/
retention = 7d
```

This would back up the most recent 7 daily snapshots (or fewer, if
fewer than 7 exist) and keep the most recent 7 on the target.

It uses rclone as a universal interface to just about every object
storage target known to man. By default it creates a zstd-compressed
tarball of the snapshot contents. Because tar is a single-threaded
bottleneck, it also has the option of doing a file-by-file directory
copy, which rclone can parallelize per file. In that case compression
has to be done by rclone (and encryption is also expected to be
handled by the rclone config). Until very recently, rclone only
supported gzip, which is very sub-optimal, but recent versions have
zstd support, which is vastly better.

So sanoid takes and manages the local snapshots, and backoid handles
backing up the contents of those snapshots to a configured rclone
endpoint.
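If anyone wants to hand-roll the equivalent, the core of it is just an
incremental send piped through a compressor. A rough Python sketch
(dataset and snapshot names are made up, and backoid's actual
implementation does considerably more: retention, rclone upload, etc.):

```python
import subprocess

def incremental_send(dataset, prev_snap, new_snap, outfile):
    """Stream only the blocks that changed between two ZFS snapshots,
    compressed with zstd. Shell equivalent:
      zfs send -i @prev_snap dataset@new_snap | zstd -3 -o outfile
    """
    send = subprocess.Popen(
        ["zfs", "send", "-i", f"@{prev_snap}", f"{dataset}@{new_snap}"],
        stdout=subprocess.PIPE,
    )
    # zstd level 3, matching the compression_level in the config above.
    subprocess.run(
        ["zstd", "-3", "-o", outfile],
        stdin=send.stdout,
        check=True,
    )
    send.stdout.close()
    if send.wait() != 0:
        raise RuntimeError("zfs send failed")

# e.g. incremental_send("zpoolname/mysql", "autosnap_2026-02-02",
#                       "autosnap_2026-02-03", "/backups/mysql.zst")
```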
> > > I am searching through the net and could not find a good solution
> > > for binlogs. Postgres, for example, has WAL archiving to remote
> > > storage or onto a master. How can I do something like this with
> > > MariaDB (without the MaxScale Binlog Router)?
> >
> > You can set log_slave_updates on the slave, which will make the
> > slave generate binlogs. The file/offset coordinates won't match the
> > master, but this isn't a problem if you use GTID.
>
> With binlog_storage_engine=innodb
> (https://jira.mariadb.org/browse/MDEV-34705) and a future development
> of innodb_log_archive=ON, it should eventually be possible to write
> only one copy of the binlog information and have something similar to
> the Postgres WAL archiving.

Interesting. The big selling point would be using it for replication
instead of the binlog, because it would be multiples more efficient,
although also less flexible: no writing of a separate binlog on the
master and no relay log I/O on the slave, which together account for
roughly 25-33% of the disk I/O consumed by writes.
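Coming back to the log_slave_updates workaround above, for the
archives, the replica-side settings look roughly like this (a sketch;
the log name and option placement are illustrative):

```
[mariadb]
log_bin           = mysql-bin   # the replica writes its own binlogs
log_slave_updates = ON          # include replicated events in them
gtid_strict_mode  = ON          # coordinates differ from the master, so use GTID
```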
--
Gordan Bobic
Database Specialist, Shattered Silicon Ltd.
https://shatteredsilicon.net

Follow us:
LinkedIn: https://www.linkedin.com/company/shatteredsilicon
X: https://x.com/ssiliconbg
_______________________________________________
discuss mailing list -- [email protected]
To unsubscribe send an email to [email protected]