On 1/19/16 1:55 PM, Alan Somers wrote:
On Tue, Jan 19, 2016 at 10:00 AM, Alan Somers <asom...@freebsd.org> wrote:
Author: asomers
Date: Tue Jan 19 17:00:25 2016
New Revision: 294329
URL: https://svnweb.freebsd.org/changeset/base/294329

Log:
   Disallow zvol-backed ZFS pools

   Using zvols as backing devices for ZFS pools is fraught with panics and
   deadlocks. For example, attempting to online a missing device in the
   presence of a zvol can cause a panic when vdev_geom tastes the zvol.  Better
   to prevent vdev_geom from ever opening a zvol.  The solution relies on
   setting a thread-local variable during vdev_geom_open, and returning
   EOPNOTSUPP from zvol_open if that thread-local variable is set.

   Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open. Its intent
   was to prevent a recursive mutex acquisition panic. However, the new check
   for the thread-local variable also fixes that problem.

   Also, fix a panic in vdev_geom_taste_orphan.  For an unknown reason, this
   function was set to panic.  But a device can disappear during tasting, and
   ignoring its departure causes no problems.

   Reviewed by:  delphij
   MFC after:    1 week
   Relnotes:     yes
   Sponsored by: Spectra Logic Corp
   Differential Revision:        https://reviews.freebsd.org/D4986

Modified:
   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h

Due to popular demand, I will conditionalize this behavior on a
sysctl, and I won't MFC it.  The sysctl must default to off (ZFS pools
on zvols not allowed), because merely having the ability to put pools
on zvols can cause panics even for users who aren't using it.
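
To make that concrete, the administrative side would look roughly like
the sketch below.  The sysctl name is illustrative only, since the
change is still planned; the point is that the default of 0 keeps
zvol-backed pools refused unless the administrator opts in explicitly:

        # Hypothetical knob; the real name may differ once committed.
        sysctl vfs.zfs.vol.recursive          # 0 by default: pools on zvols refused
        sysctl vfs.zfs.vol.recursive=1        # opt in on this host
        echo 'vfs.zfs.vol.recursive=1' >> /etc/sysctl.conf   # persist across reboots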

Thank you!

And let me clear up some confusion:

1) Having the ability to put a zpool on a zvol can cause panics and
deadlocks, even if that ability is unused.
2) Putting a zpool atop a zvol causes unnecessary performance problems
because there are two layers of COW involved, with all their software
complexities.  This also applies to putting a zpool atop files on a
ZFS filesystem.
3) A VM guest putting a zpool on its virtual disk, where the VM host
backs that virtual disk with a zvol, will work fine.  That's the ideal
use case for zvols (see the sketch after this list).
3b) Using ZFS on both host and guest isn't ideal for performance, as
described in item 2.  That's why I prefer to use UFS for VM guests.
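
A minimal sketch of item 3 from the host's side (the dataset name and
size are made up):

        # Carve out a zvol on the host; the guest treats it as a plain disk
        # and is free to run ZFS, UFS, or anything else on top of it.
        zfs create -V 20G zdata/vm/guest0
        # Hand it to the guest, e.g. via bhyve's ahci-hd device:
        #   -s 3:0,ahci-hd,/dev/zvol/zdata/vm/guest0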

The patch as committed very much breaks the way some people do operations
on zvols.  My script that does virtual machine cloning via snapshots
of zvols containing zpools is currently broken because of it.  (I upgraded
one of my dev hosts right after your commit to verify the broken
behavior.)

In my script, I boot an auto-install .iso into bhyve:

        bhyve -c 2 -m ${vmmem} -H -A -I -g 0 \
                -s 0:0,hostbridge \
                -s 1,lpc -l com1,stdio \
                -s 2:0,virtio-net,${template_tap} \
                -s 3:0,ahci-hd,"${zvol}" \
                -s 4:0,ahci-cd,"${isofile}" \
                ${vmname} || \
                echo "trapped error exit from bhyve: $?"

So, yes, the zpool gets created by the client VM.  Then on
the hypervisor host, the script imports that zpool and renames it,
so that I can have different pool names for all the client VMs.
This step now fails:

+ zpool import -R /virt/base -d /dev/zvol/zdata sys base
cannot import 'sys' as 'base': no such pool or dataset
        Destroy and re-create the pool from
        a backup source.

I import the clients' zpools after the zpools on them have been
renamed, so the hypervisor host can manipulate the files directly.
Renaming the zpools disturbs only a small number of the disk blocks
on each of the snapshots of the zvol.
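
For reference, the host-side part of the workflow is roughly the
sketch below (dataset and pool names are illustrative, and the real
script does more error handling); the zpool import is the step that
r294329 now refuses:

        # Snapshot the template zvol after the install into the first VM,
        # then clone it cheaply for each client VM.
        zfs snapshot zdata/template@gold
        zfs clone zdata/template@gold zdata/vm1

        # Import the client's pool under a new name so the clones can
        # coexist, manipulate its files from the host, then export it.
        zpool import -R /virt/vm1 -d /dev/zvol/zdata sys vm1
        # ... edit files under /virt/vm1 as needed ...
        zpool export vm1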

In this way, I can instantiate ~30 virtual machines from
a custom install.iso image in less than 3 minutes, and
the bulk of that time is spent doing the installation from the
custom install.iso into the first virtual machine.  Cloning
the zvols and manipulating the resulting filesystems is very fast.

-Kurt


