On Thu, Jan 26, 2023 at 04:29:10PM +0100, Michal Prívozník wrote:
> On 1/26/23 16:25, Peter Xu wrote:
> > On Thu, Jan 26, 2023 at 02:15:11PM +0000, Dr. David Alan Gilbert wrote:
> >> * Michal Prívozník (mpriv...@redhat.com) wrote:
> >>> On 1/25/23 23:40, Peter Xu wrote:
> >>>> The new /dev/userfaultfd handle is superior to the system call with a
> >>>> better permission control and also works for a restricted seccomp
> >>>> environment.
> >>>>
> >>>> The new device was only introduced in v6.1 so we need a header update.
> >>>>
> >>>> Please have a look, thanks.
> >>>
> >>> I was wondering whether it would make sense/be possible for mgmt app
> >>> (libvirt) to pass FD for /dev/userfaultfd instead of QEMU opening it
> >>> itself. But looking into the code, libvirt would need to do that when
> >>> spawning QEMU because that's when QEMU itself initializes internal state
> >>> and queries userfaultfd caps.
> >>
> >> You also have to be careful about what the userfaultfd semantics are; I
> >> can't remember them - but if you open it in one process and pass it to
> >> another process, which process's address space are you trying to
> >> monitor?
> > 
> > Yes it's a problem.  The kernel always fetches the current mm_struct* which
> > represents the current context of virtual address space when creating the
> > uffd handle (for either the syscall or the ioctl() approach).
> 
> Ah, I did not realize that.
> 
> > 
> > It works only if Libvirt will invoke QEMU as a thread and they'll share the
> > same address space.
> > 
> > Why would libvirt like to do so?
> 
> Well, we tend to pass files as FD more and more, because it allows us to
> give access to "privileged" files to unprivileged process. What I did
> not realize is that userfaultfd is different, not yet another file.

I see.  Yes, uffd is special compared to most other fds, IMHO mainly
because it is not a public resource but is closely bound to the mm of the
process that created it.
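
To illustrate, here is a minimal sketch (not the code from this series) of
creating a uffd handle via the device first and falling back to the syscall.
The helper name uffd_open() is made up for this example, and the fallback
define of USERFAULTFD_IOC_NEW is an assumption that mirrors the v6.1
linux/userfaultfd.h for builds against older headers.  Whichever path
succeeds, the returned fd tracks the mm of the process that runs this code,
not of whoever opened /dev/userfaultfd.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: matches the v6.1 UAPI header; only used with older headers. */
#ifndef USERFAULTFD_IOC_NEW
#define USERFAULTFD_IOC_NEW _IO(0xAA, 0x00)
#endif

/*
 * Hypothetical helper: returns a new uffd handle, or -1 on failure.
 * "flags" takes the same values as the userfaultfd(2) syscall flags,
 * e.g. O_CLOEXEC | O_NONBLOCK.
 */
static int uffd_open(int flags)
{
    int uffd = -1;
    int dev = open("/dev/userfaultfd", O_RDWR | O_CLOEXEC);

    if (dev >= 0) {
        /* The kernel binds the new handle to the caller's mm here. */
        uffd = ioctl(dev, USERFAULTFD_IOC_NEW, flags);
        close(dev);
    }

    if (uffd < 0) {
        /* Device missing or not permitted: try the legacy syscall. */
        uffd = (int)syscall(SYS_userfaultfd, flags);
    }

    return uffd;
}
```

So even if libvirt opened /dev/userfaultfd and passed that fd over, the
ioctl() would still have to be issued from within QEMU for the handle to
cover QEMU's address space.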

There used to be proposals to grant permission to open a uffd handle for
other processes, but the security implications were still not fully clear and
that discussion was discontinued.

Then the question is whether there is still any scenario in which QEMU may
not have the privilege to use either /dev/userfaultfd or the syscall.

Thanks,

-- 
Peter Xu

