Hi Benjamin and Paolo,
I would like to discuss changes to DM-Multipath and qemu-pr-helper to
handle SCSI Persistent Reservations in QEMU without privileged code.

SCSI Persistent Reservations support in QEMU is built on the
qemu-pr-helper daemon that performs PERSISTENT RESERVATION IN and
PERSISTENT RESERVATION OUT commands on behalf of the guest. The
qemu-pr-helper process provides privilege separation: it holds the
CAP_SYS_RAWIO capability needed for ioctl(SG_IO) and the root privileges
needed for libmpathpersist, since the main QEMU process should not have
those privileges.

There are issues with the current approach:
- Privileged code is a security attack surface.
- A bunch of code is required for privilege separation and for management
  tools to set up qemu-pr-helper with access to multipathd.
- The interface is SCSI-specific and does not support NVMe.

Several of us have pondered a different approach that I will summarize
here. The <linux/pr.h> ioctl interface provides an alternative to
ioctl(SG_IO) without the CAP_SYS_RAWIO requirement. It supports both
SCSI and NVMe. Since privileges are not required, there would be no need
for the qemu-pr-helper daemon anymore.
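
To make the comparison concrete, here is a minimal sketch of registering a
key and taking a reservation through <linux/pr.h>. The helper name
pr_register_and_reserve() is hypothetical and the error handling is
simplified. It is only meant to illustrate the ioctl interface, not how QEMU
would actually integrate it.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/pr.h>

/*
 * Hypothetical helper: register `key` on the block device behind `fd`,
 * then take a reservation of the given type using that key.
 *
 * Returns 0 on success, or -1 with errno set. The ioctls may also return
 * a positive device status code. This sketch folds that case into EIO
 * instead of decoding it.
 */
static int pr_register_and_reserve(int fd, uint64_t key, enum pr_type type)
{
    struct pr_registration reg;
    struct pr_reservation rsv;
    int ret;

    /* old_key == 0 means this is a new registration */
    memset(&reg, 0, sizeof(reg));
    reg.old_key = 0;
    reg.new_key = key;

    ret = ioctl(fd, IOC_PR_REGISTER, &reg);
    if (ret != 0) {
        if (ret > 0) {
            errno = EIO;
        }
        return -1;
    }

    /* Reserve the device with the key that was just registered */
    memset(&rsv, 0, sizeof(rsv));
    rsv.key = key;
    rsv.type = type;

    ret = ioctl(fd, IOC_PR_RESERVE, &rsv);
    if (ret != 0) {
        if (ret > 0) {
            errno = EIO;
        }
        return -1;
    }

    return 0;
}
```

A caller would open the whole block device (not a partition) and pass the
file descriptor, for example with PR_WRITE_EXCLUSIVE as the type. My
understanding is that recent kernels accept these ioctls from an
unprivileged process only if the file descriptor is open for writing, but
please treat that as an assumption to verify against the kernel version in
use.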

The blocker is that <linux/pr.h> is not usable in multipath
environments. The Linux DM-Multipath driver has an incomplete ioctl
implementation that falls short of what libmpathpersist and multipathd
do in userspace. Kernel changes are necessary to fix this.

My suggestion is to implement <linux/pr.h> via upcalls from DM-Multipath
to multipathd. That way applications like QEMU can consistently use
<linux/pr.h> across block device types and no longer have to go through
the privileged libmpathpersist interface.

Once DM-Multipath support for <linux/pr.h> is functional, the main QEMU
process can directly invoke the ioctls. qemu-pr-helper will no longer be
needed, eliminating privileged code and simplifying the setup required
by management tools such as libvirt and KubeVirt.

The only loss in functionality that I have identified when switching to
<linux/pr.h> is that qemu-pr-helper supports SCSI TransportIDs for the
PERSISTENT RESERVATION OUT command. This is not supported by
<linux/pr.h>, but I'm not sure how this even works today since the guest
sees a virtual SCSI bus and is unaware of the physical bus or HBA. So
maybe that was never used in the first place?

Does this plan sound good to you?

Benjamin: I can work on the DM-Multipath upcalls if you are busy.

Thanks,
Stefan
