Hi Benjamin and Paolo,

I would like to discuss changes to DM-Multipath and qemu-pr-helper to handle SCSI Persistent Reservations in QEMU without privileged code.

SCSI Persistent Reservations support in QEMU is built on the qemu-pr-helper daemon, which performs PERSISTENT RESERVATION IN and PERSISTENT RESERVATION OUT commands on behalf of the guest. The qemu-pr-helper process provides privilege separation for ioctl(SG_IO)'s CAP_SYS_RAWIO and libmpathpersist's root privileges, since the main QEMU process should not have those privileges.

There are issues with the current approach:
- Privileged code is a security attack surface.
- A bunch of code is required for privilege separation and for management tools to set up qemu-pr-helper with access to multipathd.
- The interface is SCSI-specific and does not support NVMe.

Several of us have pondered a different approach that I will summarize here.

The <linux/pr.h> ioctl interface provides an alternative to ioctl(SG_IO) without the CAP_SYS_RAWIO requirement. It supports both SCSI and NVMe. Since privileges are not required, there would be no need for the qemu-pr-helper daemon anymore.

The blocker is that <linux/pr.h> is not usable in multipath environments. The Linux DM-Multipath driver has an incomplete ioctl implementation that falls short of what libmpathpersist and multipathd do in userspace. Kernel changes are necessary to fix this.

My suggestion is to implement <linux/pr.h> via upcalls from DM-Multipath to multipathd. That way applications like QEMU can consistently use <linux/pr.h> across block device types and no longer have to go through the privileged libmpathpersist interface.

Once DM-Multipath support for <linux/pr.h> is functional, the main QEMU process can invoke the ioctls directly. qemu-pr-helper will no longer be needed, eliminating privileged code and simplifying the setup required by management tools such as libvirt and KubeVirt.

The only loss in functionality that I have identified when switching to <linux/pr.h> is that qemu-pr-helper supports SCSI TransportIDs for the PERSISTENT RESERVATION OUT command.
This is not supported by <linux/pr.h>, but I'm not sure how this even works today since the guest sees a virtual SCSI bus and is unaware of the physical bus or HBA. So maybe that was never used in the first place?

Does this plan sound good to you?

Benjamin: I can work on the DM-Multipath upcalls if you are busy.

Thanks,
Stefan
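
For reference, here is a minimal sketch of the kind of <linux/pr.h> calls the main QEMU process could issue directly on a block device file descriptor. The helper names pr_register_key() and pr_reserve_write_exclusive() are hypothetical and only illustrate the ioctl interface; this is not existing QEMU code.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/pr.h>

/*
 * Hypothetical helper: register new_key on the block device open at fd.
 * old_key is 0 when no key is registered yet.
 * Returns the ioctl() result: -1 with errno set on a syscall failure.
 */
int pr_register_key(int fd, uint64_t old_key, uint64_t new_key)
{
    struct pr_registration reg;

    memset(&reg, 0, sizeof(reg));
    reg.old_key = old_key;
    reg.new_key = new_key;
    return ioctl(fd, IOC_PR_REGISTER, &reg);
}

/*
 * Hypothetical helper: take a Write Exclusive reservation using a key
 * that was registered beforehand.
 */
int pr_reserve_write_exclusive(int fd, uint64_t key)
{
    struct pr_reservation rsv;

    memset(&rsv, 0, sizeof(rsv));
    rsv.key = key;
    rsv.type = PR_WRITE_EXCLUSIVE;
    return ioctl(fd, IOC_PR_RESERVE, &rsv);
}
```

A caller would open the block device, call pr_register_key(fd, 0, key) and then pr_reserve_write_exclusive(fd, key). The same two calls would apply to SCSI, NVMe and, with the proposed upcalls, DM-Multipath devices.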
