Hi.

On Fri, Mar 17, 2023 at 01:52:34PM +0100, Nicolas George wrote:
> Reco (12023-03-17):
> > - DRBD
> 
> That looks interesting, with “meta-disk device”.
> 
> > - MDADM + iSCSI
> 
> Maybe possible, but not the way you suggest, see below.
> 
> > - zpool attach/detach
> 
> I do not think that is an option. Can you explain how you think it can
> work?

It's similar to MDADM, but with a small bonus and a pile of drawbacks on
top of it.

Create a zpool from your device.
Yes, it will destroy the contents of the device, so back up your files
beforehand and put them back after the pool is created.
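A minimal sketch, with "tank" as a hypothetical pool name and
/dev/local_dev standing in for your actual device:

# zpool create tank /dev/local_dev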

Use iSCSI/NBD/FCoE/NVMe (basically any network protocol that can provide
a block device to another host) to make your zpool mirrored.
This is done with the zpool attach/detach commands.
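For example, assuming the pool is called "tank" and the remote block
device shows up locally as /dev/remote_dev (the exact name depends on
the transport you picked):

# zpool attach tank /dev/local_dev /dev/remote_dev
# zpool status tank
# zpool detach tank /dev/remote_dev

attach turns the single-device vdev into a two-way mirror and starts
resilvering (watch it with zpool status), detach removes the network
side again.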

The small bonus I mentioned earlier is that "zpool resilvering"
(synchronization between mirror sides) only concerns the actual data
residing in the zpool. I.e. if you have a 1TB mirrored zpool which is
filled to 200GB, you will resync only 200GB.
In comparison, an mdadm RAID resync will happily read 1TB from one drive
and write 1TB to the other *unless* you're using mdadm bitmaps.
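(Such a write-intent bitmap can be added to an existing array after the
fact:

# mdadm --grow --bitmap=internal /dev/md127

with it, a resync after a temporary disconnect only touches the regions
marked dirty in the bitmap.)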

ZFS/zpool drawbacks are numerous and well-documented, but I'll mention a
single one: you do not fill your zpool to 100%. In fact, even 90%
capacity usually means trouble.
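You can keep an eye on that with:

# zpool list tank

and watch the CAP column.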


> 
> > mdadm --create /dev/md0 --level=mirror --force --raid-devices=1 \
> >     --metadata=1.0 /dev/local_dev missing
> > 
> > --metadata=1.0 is highly important here, as it's one of the few mdadm
> > metadata formats that keeps said metadata at the end of the device.
> 
> Well, I am sorry to report that you did not read my message carefully
> enough: keeping the metadata at the end of the device is no more an
> option than keeping it at the beginning or in the middle: there is
> already data everywhere on the device.

Not unless you know the magic trick. See below.

> Also, the mdadm command you just gave is pretty explicit that it will
> wipe the local device.

You mean, like this?

# mdadm --create /dev/md127 --level=mirror --force --raid-devices=2 \
        --metadata=1.0 /dev/loop0 missing
mdadm: /dev/loop0 appears to contain an ext2fs file system
       size=1048512K  mtime=Thu Jan  1 00:00:00 1970
Continue creating array?


mdadm lies to you :) This is how it's done.

# tune2fs -l /dev/loop0 | grep 'Block count'
Block count:              262144
# resize2fs /dev/loop0 262128
resize2fs 1.46.2 (28-Feb-2021)
Resizing the filesystem on /dev/loop0 to 262128 (4k) blocks.
The filesystem on /dev/loop0 is now 262128 (4k) blocks long.
# mdadm --create /dev/md127 --level=mirror --force --raid-devices=2 \
        --metadata=1.0 /dev/loop0 missing
mdadm: /dev/loop0 appears to contain an ext2fs file system
       size=1048512K  mtime=Thu Jan  1 00:00:00 1970
Continue creating array? y
mdadm: array /dev/md127 started.

# fsck -f /dev/md127
fsck from util-linux 2.36.1
e2fsck 1.46.2 (28-Feb-2021)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/md127: 11/65536 files (0.0% non-contiguous), 12955/262128 blocks
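
(If you want to double-check where the superblock ended up,

# mdadm --examine /dev/loop0

should report a "Super Offset" near the very end of the device for 1.0
metadata.)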


And the main beauty of it is that the kernel will refuse to let you run
"resize2fs /dev/local_dev" as long as the MD array is assembled, and
"resize2fs /dev/md127" will take those 16 4k blocks at the end into
account.

And I'm pretty sure you can afford to shrink your filesystem by
16 4k blocks (64 KiB).
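Once the shrunken filesystem is inside the array, completing the mirror
later is just (assuming the network block device shows up as
/dev/remote_dev):

# mdadm /dev/md127 --add /dev/remote_dev

which starts the resync onto the remote side.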

That --metadata=1.0 is the main part of the trick. One can easily shrink
a filesystem from its tail, but it's much harder to do the same from its
head (which is what you'd have to do with --metadata=1.2).

Reco
