[ https://issues.apache.org/jira/browse/MESOS-8286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16278923#comment-16278923 ]
James Peach edited comment on MESOS-8286 at 12/5/17 6:18 PM:
-------------------------------------------------------------

This happens because once the mount namespace is a child of a user namespace, it is considered unprivileged and the {{CL_UNPRIVILEGED}} flag is set on the mount. Once this flag is set, a remount that changes the flags *must* preserve all the existing flags on the mount (see [do_remount|https://github.com/torvalds/linux/blob/master/fs/namespace.c#L2283]).

When we bind mount a file from the host, the new bind mount inherits the mount flags of the host filesystem. For example:

{noformat}
/NetworkManager/resolv.conf /tmp/ExecutorType_UserNamespaceIsolatorTest_ROOT_USER_DockerTask_DefaultExecutor_Ea3QSF/provisioner/containers/2378c60f-d0ab-4144-8df5-46a2f2b5e9fe/containers/a1700239-518d-4908-a7b4-21deda36df8a/backends/overlay/rootfses/12c7b91f-5042-473d-bf38-30f4e9127e08/etc/resolv.conf rw,nosuid,nodev rw,mode=755 tmpfs tmpfs
...
Failed to remount bind mount as readonly from '/etc/resolv.conf' to '/tmp/ExecutorType_UserNamespaceIsolatorTest_ROOT_USER_DockerTask_DefaultExecutor_Ea3QSF/provisioner/containers/2378c60f-d0ab-4144-8df5-46a2f2b5e9fe/containers/a1700239-518d-4908-a7b4-21deda36df8a/backends/overlay/rootfses/12c7b91f-5042-473d-bf38-30f4e9127e08/etc/resolv.conf': Operation not permitted
{noformat}

Updating the {{MS_RDONLY}} flag fails because, although {{MS_REMOUNT}} only implements changing {{MS_RDONLY}}, it actually checks that all the per-mount flags were preserved, and we omitted the inherited {{MS_NOSUID|MS_NODEV}} flags.

was (Author: jamespeach):
This happens because once the mount namespace is a child of a user namespace, it is considered unprivileged and the {{CL_UNPRIVILEGED}} flag is set on the mount. Once this flag is set, then a remount that changes the flags *must* preserve all the existing flags on the mount (see [do_remount|https://github.com/torvalds/linux/blob/master/fs/namespace.c#L2283]).

When we bind mount a file from the host, the mount flags from the host filesystem are inherited to the new bind mount. For example:

{noformat}
/NetworkManager/resolv.conf /tmp/ExecutorType_UserNamespaceIsolatorTest_ROOT_USER_DockerTask_DefaultExecutor_Ea3QSF/provisioner/containers/2378c60f-d0ab-4144-8df5-46a2f2b5e9fe/containers/a1700239-518d-4908-a7b4-21deda36df8a/backends/overlay/rootfses/12c7b91f-5042-473d-bf38-30f4e9127e08/etc/resolv.conf rw,nosuid,nodev rw,mode=755 tmpfs tmpfs
...
Failed to remount bind mount as readonly from '/etc/resolv.conf' to '/tmp/ExecutorType_UserNamespaceIsolatorTest_ROOT_USER_DockerTask_DefaultExecutor_Ea3QSF/provisioner/containers/2378c60f-d0ab-4144-8df5-46a2f2b5e9fe/containers/a1700239-518d-4908-a7b4-21deda36df8a/backends/overlay/rootfses/12c7b91f-5042-473d-bf38-30f4e9127e08/etc/resolv.conf': Operation not permitted
{noformat}

Updating the {{MS_RDONLY}} flag fails because {{MS_REMOUNT}} actually updates all the flags and we omitted the inherited {{MS_NOSUID|MS_NODEV}} flags.

> Making bind mounts readonly fails with user namespaces.
> -------------------------------------------------------
>
>                 Key: MESOS-8286
>                 URL: https://issues.apache.org/jira/browse/MESOS-8286
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: James Peach
>            Assignee: James Peach
>
> When user namespaces are in effect, the additional mounts performed by the
> CNI isolator to bind host network files read-only fail. The initial bind
> mount succeeds, but the subsequent remount is failing. The reason for the
> failure isn't clear to me - there are a number of kernel checks and I don't
> know which one is failing yet.
> {noformat}
> ...
> [pid 15609] execve("/home/jpeach/src/mesos/build/src/mesos-containerizer", ["/home/jpeach/src/mesos/build/src"..., "launch"], 0x7f74a001c450 /* 30 vars */I1130 17:04:34.281958 15537 containerizer.cpp:2921] Transitioning the state of container 0a0fdd6b-9532-4010-913b-5e36cad6f666.c4b9a777-eb6c-4c4a-9c4c-5d39e23373eb from PREPARING to ISOLATING
> ) = 0
> strace: Process 15610 attached
> [pid 15610] execve("/home/jpeach/src/mesos/build/src/mesos-containerizer", ["mesos-containerizer", "network-cni-setup", "--bind_host_files=false", "--bind_readonly=true", "--etc_hostname_path=/etc/hostnam"..., "--etc_hosts_path=/etc/hosts", "--etc_resolv_conf=/etc/resolv.co"..., "--help=false", "--pid=15609", "--rootfs=/tmp/ExecutorType_UserN"...], 0x58f07f0 /* 24 vars */) = 0
> [pid 15610] mount(NULL, "/", NULL, MS_REC|MS_SLAVE, NULL) = 0
> [pid 15610] mount("/etc/resolv.conf", "/tmp/ExecutorType_UserNamespaceIsolatorTest_ROOT_USER_DockerTask_DefaultExecutor_IMJpTh/provisioner/containers/0a0fdd6b-9532-4010-913b-5e36cad6f666/containers/c4b9a777-eb6c-4c4a-9c4c-5d39e23373eb/backends/overlay/rootfses/0aaba267-75e7-444a-9f3a-adb22adcf195/etc/resolv.conf", NULL, MS_BIND, NULL) = 0
> [pid 15610] mount(NULL, "/tmp/ExecutorType_UserNamespaceIsolatorTest_ROOT_USER_DockerTask_DefaultExecutor_IMJpTh/provisioner/containers/0a0fdd6b-9532-4010-913b-5e36cad6f666/containers/c4b9a777-eb6c-4c4a-9c4c-5d39e23373eb/backends/overlay/rootfses/0aaba267-75e7-444a-9f3a-adb22adcf195/etc/resolv.conf", NULL, MS_RDONLY|MS_REMOUNT, NULL) = -1 EPERM (Operation not permitted)
> [pid 15610] +++ exited with 1 +++
> ...
> {noformat}
> Note that in this log I've experimentally modified the mount flags, but that
> doesn't make any difference.