On Tue, Jun 2, 2020, at 5:55 AM, Stefan Hajnoczi wrote:
> 
> Ping Colin. It would be great if you have time to share your thoughts on
> this discussion and explain how you are using this patch.

Yeah, sorry about not replying in this thread earlier.  This was just a quick
Friday side project for me and the thread obviously exploded =)

Thinking about this more, what would probably be good enough for now is an
option to just disable the internal containerization/sandboxing.  As came up in
the discussion, our production pipeline runs inside OpenShift 4.  Because
Kubernetes doesn't support user namespaces yet, it also doesn't support
recursive containerization, so we need a way to turn off the internal
containerization.
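
A rough way to check whether a given environment permits the namespace setup
is sketched below.  This is not part of the patch: the helper name
probe_unshare() is made up for illustration, and it uses the raw unshare
syscall with the same three flags that setup_namespaces() passes.  It runs in
a forked child so the caller's own namespaces are left alone.  The assumption
is that a container without CAP_SYS_ADMIN gets EPERM back, though that has not
been verified on every runtime.

```c
#include <errno.h>
#include <linux/sched.h>   /* CLONE_NEWPID, CLONE_NEWNS, CLONE_NEWNET */
#include <sys/syscall.h>   /* SYS_unshare */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: try the same unshare() flags virtiofsd uses, but in
 * a throwaway child process.
 *
 * Returns 0 if unshare succeeded, the child's errno value (e.g. EPERM) if
 * it failed, or -1 if the probe itself could not be run.
 */
int probe_unshare(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: attempt the namespace switch and report via exit status. */
        long rc = syscall(SYS_unshare,
                          CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET);
        _exit(rc == 0 ? 0 : errno);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

A caller could compare the result against EPERM to decide whether running
without the internal sandbox is the only choice in that environment.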

Our use case is somewhat specialized: for what we're doing we generally trust
the guest.  We use VMs for operating system testing and development of content
we trust, as opposed to something like Kata Containers.

It's fine for us to run virtiofsd with the same user/security context as QEMU.

So...something like this?  (Only compile tested)

diff --git a/tools/virtiofsd/fuse_i.h b/tools/virtiofsd/fuse_i.h
index 1240828208..603773c505 100644
--- a/tools/virtiofsd/fuse_i.h
+++ b/tools/virtiofsd/fuse_i.h
@@ -51,6 +51,7 @@ struct fuse_session {
     int fd;
     int debug;
     int deny_others;
+    int no_namespaces;
     struct fuse_lowlevel_ops op;
     int got_init;
     struct cuse_data *cuse_data;
diff --git a/tools/virtiofsd/fuse_lowlevel.c b/tools/virtiofsd/fuse_lowlevel.c
index 2dd36ec03b..263134f792 100644
--- a/tools/virtiofsd/fuse_lowlevel.c
+++ b/tools/virtiofsd/fuse_lowlevel.c
@@ -2522,6 +2522,7 @@ static const struct fuse_opt fuse_ll_opts[] = {
     LL_OPTION("-d", debug, 1),
     LL_OPTION("--debug", debug, 1),
     LL_OPTION("allow_root", deny_others, 1),
+    LL_OPTION("--no-namespaces", no_namespaces, 1),
     LL_OPTION("--socket-path=%s", vu_socket_path, 0),
     LL_OPTION("--fd=%d", vu_listen_fd, 0),
     LL_OPTION("--thread-pool-size=%d", thread_pool_size, 0),
@@ -2542,6 +2543,7 @@ void fuse_lowlevel_help(void)
      */
     printf(
         "    -o allow_root              allow access by root\n"
+        "    --no-namespaces            Disable internal use of unshare()/clone(UNSHARE)\n"
         "    --socket-path=PATH         path for the vhost-user socket\n"
         "    --fd=FDNUM                 fd number of vhost-user socket\n"
         "    --thread-pool-size=NUM     thread pool size limit (default %d)\n",
diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 3ba1d90984..7c54a9cde3 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -2551,15 +2551,15 @@ static void setup_namespaces(struct lo_data *lo, struct fuse_session *se)
     char *tmpdir;
 
     /*
-     * Create a new pid namespace for *child* processes.  We'll have to
-     * fork in order to enter the new pid namespace.  A new mount namespace
-     * is also needed so that we can remount /proc for the new pid
-     * namespace.
-     *
-     * Our UNIX domain sockets have been created.  Now we can move to
-     * an empty network namespace to prevent TCP/IP and other network
-     * activity in case this process is compromised.
-     */
+    * Create a new pid namespace for *child* processes.  We'll have to
+    * fork in order to enter the new pid namespace.  A new mount namespace
+    * is also needed so that we can remount /proc for the new pid
+    * namespace.
+    *
+    * Our UNIX domain sockets have been created.  Now we can move to
+    * an empty network namespace to prevent TCP/IP and other network
+    * activity in case this process is compromised.
+    */
     if (unshare(CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET) != 0) {
         fuse_log(FUSE_LOG_ERR, "unshare(CLONE_NEWPID | CLONE_NEWNS): %m\n");
         exit(1);
@@ -2775,6 +2775,9 @@ static void setup_capabilities(void)
 static void setup_sandbox(struct lo_data *lo, struct fuse_session *se,
                           bool enable_syslog)
 {
+    if (se->no_namespaces) {
+        return;
+    }
     setup_namespaces(lo, se);
     setup_mounts(lo->source);
     setup_seccomp(enable_syslog);
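
Worth noting about the last hunk: because the check is an early return at the
top of setup_sandbox(), the flag skips every call that follows it in that
function, not only setup_namespaces().  The standalone model below illustrates
just that control flow.  It is a sketch, not virtiofsd code: struct
session_sketch, the STEP_* bits and setup_sandbox_sketch() are hypothetical
stand-ins, and only the three calls visible in the hunk are modelled.

```c
/* Hypothetical stand-in for the one field of struct fuse_session we need. */
struct session_sketch {
    int no_namespaces;
};

/* One bit per sandbox step visible in the hunk. */
enum {
    STEP_NAMESPACES = 1 << 0,   /* models setup_namespaces() */
    STEP_MOUNTS     = 1 << 1,   /* models setup_mounts() */
    STEP_SECCOMP    = 1 << 2    /* models setup_seccomp() */
};

/*
 * Models the control flow of the patched setup_sandbox(): returns a bitmask
 * of the steps that would have run.
 */
unsigned setup_sandbox_sketch(const struct session_sketch *se)
{
    unsigned steps = 0;

    if (se->no_namespaces) {
        return steps;           /* early return: nothing below runs */
    }
    steps |= STEP_NAMESPACES;
    steps |= STEP_MOUNTS;
    steps |= STEP_SECCOMP;
    return steps;
}
```

With no_namespaces set, the model returns 0, i.e. the mount and seccomp steps
are skipped along with the namespaces.  That matches the "disable internal
sandboxing" intent described above, even though the option name only mentions
namespaces.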
