On Thu, Jan 28, 2021 at 06:44:16PM +0100, Greg Kurz wrote: > On Wed, 27 Jan 2021 11:21:31 +0000 > Stefan Hajnoczi <stefa...@redhat.com> wrote: > > > A well-behaved FUSE client does not attempt to open special files with > > FUSE_OPEN because they are handled on the client side (e.g. device nodes > > are handled by client-side device drivers). > > > > The check to prevent virtiofsd from opening special files is missing in > > a few cases, most notably FUSE_OPEN. A malicious client can cause > > virtiofsd to open a device node, potentially allowing the guest to > > escape. This can be exploited by a modified guest device driver. It is > > not exploitable from guest userspace since the guest kernel will handle > > special files inside the guest instead of sending FUSE requests. > > > > This patch adds the missing checks to virtiofsd. This is a short-term > > solution because it does not prevent a compromised virtiofsd process > > from opening device nodes on the host. > > > > Reported-by: Alex Xu <a...@alxu.ca> > > Fixes: CVE-2020-35517 > > Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com> > > Reviewed-by: Vivek Goyal <vgo...@redhat.com> > > Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com> > > --- > > v3: > > * Protect lo_create() [Greg] > > v2: > > * Add doc comment clarifying that symlinks are traversed client-side > > [Daniel] > > > > This issue was diagnosed on public IRC and is therefore already known > > and not embargoed. > > > > A stronger fix, and the long-term solution, is for users to mount the > > shared directory and any sub-mounts with nodev, as well as nosuid and > > noexec. Unfortunately virtiofsd cannot do this automatically because > > bind mounts added by the user after virtiofsd has launched would not be > > detected. I suggest the following: > > > > 1. Modify libvirt and Kata Containers to explicitly set these mount > > options. > > 2. Then modify virtiofsd to check that the shared directory has the > > necessary options at startup. Refuse to start if the options are > > missing so that the user is aware of the security requirements. > > > > As a bonus this also increases the likelihood that other host processes > > besides virtiofsd will be protected by nosuid/noexec/nodev so that a > > malicious guest cannot drop these files in place and then arrange for a > > host process to come across them. > > > > Additionally, user namespaces have been discussed. They seem like a > > worthwhile addition as an unprivileged or privilege-separated mode > > although there are limitations with respect to security xattrs and the > > actual uid/gid stored on the host file system not corresponding to the > > guest uid/gid. > > --- > > tools/virtiofsd/passthrough_ll.c | 104 ++++++++++++++++++++++--------- > > 1 file changed, 74 insertions(+), 30 deletions(-) > > > > diff --git a/tools/virtiofsd/passthrough_ll.c > > b/tools/virtiofsd/passthrough_ll.c > > index 5fb36d9407..054ad439a5 100644 > > --- a/tools/virtiofsd/passthrough_ll.c > > +++ b/tools/virtiofsd/passthrough_ll.c > > @@ -555,6 +555,30 @@ static int lo_fd(fuse_req_t req, fuse_ino_t ino) > > return fd; > > } > > > > +/* > > + * Open a file descriptor for an inode. Returns -EBADF if the inode is not > > a > > + * regular file or a directory. Use this helper function instead of raw > > + * openat(2) to prevent security issues when a malicious client opens > > special > > + * files such as block device nodes. Symlink inodes are also rejected since > > + * symlinks must already have been traversed on the client side. > > + */ > > +static int lo_inode_open(struct lo_data *lo, struct lo_inode *inode, > > + int open_flags) > > +{ > > + g_autofree char *fd_str = g_strdup_printf("%d", inode->fd); > > + int fd; > > + > > + if (!S_ISREG(inode->filetype) && !S_ISDIR(inode->filetype)) { > > + return -EBADF; > > + } > > + > > + fd = openat(lo->proc_self_fd, fd_str, open_flags); > > + if (fd < 0) { > > + return -errno; > > + } > > + return fd; > > +} > > + > > static void lo_init(void *userdata, struct fuse_conn_info *conn) > > { > > struct lo_data *lo = (struct lo_data *)userdata; > > @@ -684,8 +708,7 @@ static void lo_setattr(fuse_req_t req, fuse_ino_t ino, > > struct stat *attr, > > if (fi) { > > truncfd = fd; > > } else { > > - sprintf(procname, "%i", ifd); > > - truncfd = openat(lo->proc_self_fd, procname, O_RDWR); > > + truncfd = lo_inode_open(lo, inode, O_RDWR); > > if (truncfd < 0) { > > goto out_err; > > } > > @@ -1654,9 +1677,11 @@ static void update_open_flags(int writeback, int > > allow_direct_io, > > static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name, > > mode_t mode, struct fuse_file_info *fi) > > { > > + int open_flags = (fi->flags | O_CREAT) & ~O_NOFOLLOW; > > int fd; > > struct lo_data *lo = lo_data(req); > > struct lo_inode *parent_inode; > > + struct lo_inode *existing_inode = NULL; > > struct fuse_entry_param e; > > int err; > > struct lo_cred old = {}; > > @@ -1682,11 +1707,23 @@ static void lo_create(fuse_req_t req, fuse_ino_t > > parent, const char *name, > > > > update_open_flags(lo->writeback, lo->allow_direct_io, fi); > > > > - fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & > > ~O_NOFOLLOW, > > - mode); > > + /* First, try to create a new file but don't open existing files */ > > + fd = openat(parent_inode->fd, name, open_flags | O_EXCL, mode); > > err = fd == -1 ? errno : 0; > > + > > lo_restore_cred(&old); > > > > + /* Second, open existing files if O_EXCL was not specified */ > > + if (err == EEXIST && !(fi->flags & O_EXCL)) { > > + existing_inode = lookup_name(req, parent, name); > > No sure about the exact semantics of lookup_name()... > > > + if (existing_inode) { > > IIUC we could stat() an ${name} path in the directory and > it matches an inode we already know about, right ? > > > + fd = lo_inode_open(lo, existing_inode, open_flags); > > + if (fd < 0) { > > + err = -fd; > > + } > > + } > > What if lookup_name() returned false ? This means either there's > no ${name} path, which looks like the race we were discussing > with Miklos, or there's a ${name} but it doesn't match anything > we know... I guess the latter can happen if the ${name} was > created externally but we never had a chance to do a lookup > yet, right ? Shouldn't we do one at this point ? > > For now, it seems that both cases will return EEXIST, which > is likely confusing if O_EXCL was not specified.
lo_rmdir(), lo_unlink(), and lo_rename() all behave this way too. That's another issue that needs to be addressed separately :). I have an idea for unifying lo_open() and lo_create(). It will solve this issue by creating new inodes if necessary. Stefan
signature.asc
Description: PGP signature