Re: [systemd-devel] systemd-nspawn create container under unprivileged user
On Wed, 11.02.15 17:53, Djalal Harouni (tix...@opendz.org) wrote:

> On Wed, Feb 11, 2015 at 05:06:56PM +0100, Lennart Poettering wrote:
> > On Wed, 11.02.15 13:53, Djalal Harouni (tix...@opendz.org) wrote:
> >
> > > On Tue, Feb 10, 2015 at 12:52:34PM +0100, Lennart Poettering wrote:
> > > > On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:
> > > >
> > > > > Hello!
> > > > > Is it possible to create a container as a regular user? Or what
> > > > > capabilities do I need to add to create a container without
> > > > > using root?
> > > >
> > > > Invoking containers without privileges is not supported by nspawn, and
> > > > this is unlikely to change, as I fail to see any strong use case for
> > > > this...
> > > >
> > > > If somebody can enlighten me about the use case for allowing
> > > > containers to be run by unprivileged users, I'd be willing to change
> > > > my mind though...
> > > A quick argument against it, IOW just wait and see!
> > >
> > > As unprivileged we don't have CAP_SYS_MODULE set, but inside
> > > unprivileged containers we are root, and a call to cap_get_flag() on
> > > CAP_SYS_MODULE will return CAP_SET! But in reality this is not true,
> > > we don't have CAP_SYS_MODULE... This will confuse programs running
> > > inside containers, and we'll have to add more code paths for this
> > > special case... And not only CAP_SYS_MODULE, perhaps there are other
> > > cases...
> >
> > Well, but we could drop CAP_SYS_MODULE both before and after setting
> > up the userns, so that the cap is missing for the PID both inside and
> > outside of it...
> Indeed, yes, but still there are other obscure cases, like CAP_SYS_ADMIN:
> even if you have it, you won't be able to mount file systems like btrfs
> and others, only a subset of virtual filesystems support unprivileged
> user mounting... Yes, we could drop it too, and it seems that systemd was
> adapted recently to work in this situation, but what about other code?
> Or if you want to do some sort of system replication inside a
> container...

Well, some mounting is allowed if you have CAP_SYS_ADMIN, so we can
pass this out, I figure...

Note that the inability to mount btrfs shouldn't be too limiting,
since we don't expose physical devices in nspawn anyway, and what you
don't have you cannot mount anyway...

Lennart

--
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] systemd-nspawn create container under unprivileged user
On Wed, Feb 11, 2015 at 05:06:56PM +0100, Lennart Poettering wrote:
> On Wed, 11.02.15 13:53, Djalal Harouni (tix...@opendz.org) wrote:
>
> > On Tue, Feb 10, 2015 at 12:52:34PM +0100, Lennart Poettering wrote:
> > > On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:
> > >
> > > > Hello!
> > > > Is it possible to create a container as a regular user? Or what
> > > > capabilities do I need to add to create a container without
> > > > using root?
> > >
> > > Invoking containers without privileges is not supported by nspawn, and
> > > this is unlikely to change, as I fail to see any strong use case for
> > > this...
> > >
> > > If somebody can enlighten me about the use case for allowing
> > > containers to be run by unprivileged users, I'd be willing to change
> > > my mind though...
> > A quick argument against it, IOW just wait and see!
> >
> > As unprivileged we don't have CAP_SYS_MODULE set, but inside
> > unprivileged containers we are root, and a call to cap_get_flag() on
> > CAP_SYS_MODULE will return CAP_SET! But in reality this is not true,
> > we don't have CAP_SYS_MODULE... This will confuse programs running
> > inside containers, and we'll have to add more code paths for this
> > special case... And not only CAP_SYS_MODULE, perhaps there are other
> > cases...
>
> Well, but we could drop CAP_SYS_MODULE both before and after setting
> up the userns, so that the cap is missing for the PID both inside and
> outside of it...

Indeed, yes, but still there are other obscure cases, like CAP_SYS_ADMIN:
even if you have it, you won't be able to mount file systems like btrfs
and others, only a subset of virtual filesystems support unprivileged
user mounting... Yes, we could drop it too, and it seems that systemd was
adapted recently to work in this situation, but what about other code?
Or if you want to do some sort of system replication inside a
container...

I guess we'll end up trying to find out whether this is the real
capability or the diminished version... or whether we are inside a
userns...

> Lennart
>
> --
> Lennart Poettering, Red Hat

--
Djalal Harouni
http://opendz.org
Re: [systemd-devel] systemd-nspawn create container under unprivileged user
On Wed, 11.02.15 13:53, Djalal Harouni (tix...@opendz.org) wrote:

> On Tue, Feb 10, 2015 at 12:52:34PM +0100, Lennart Poettering wrote:
> > On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:
> >
> > > Hello!
> > > Is it possible to create a container as a regular user? Or what
> > > capabilities do I need to add to create a container without
> > > using root?
> >
> > Invoking containers without privileges is not supported by nspawn, and
> > this is unlikely to change, as I fail to see any strong use case for
> > this...
> >
> > If somebody can enlighten me about the use case for allowing
> > containers to be run by unprivileged users, I'd be willing to change
> > my mind though...
> A quick argument against it, IOW just wait and see!
>
> As unprivileged we don't have CAP_SYS_MODULE set, but inside
> unprivileged containers we are root, and a call to cap_get_flag() on
> CAP_SYS_MODULE will return CAP_SET! But in reality this is not true,
> we don't have CAP_SYS_MODULE... This will confuse programs running
> inside containers, and we'll have to add more code paths for this
> special case... And not only CAP_SYS_MODULE, perhaps there are other
> cases...

Well, but we could drop CAP_SYS_MODULE both before and after setting
up the userns, so that the cap is missing for the PID both inside and
outside of it...

Lennart

--
Lennart Poettering, Red Hat
Re: [systemd-devel] systemd-nspawn create container under unprivileged user
On Tue, Feb 10, 2015 at 12:52:34PM +0100, Lennart Poettering wrote:
> On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:
>
> > Hello!
> > Is it possible to create a container as a regular user? Or what
> > capabilities do I need to add to create a container without
> > using root?
>
> Invoking containers without privileges is not supported by nspawn, and
> this is unlikely to change, as I fail to see any strong use case for
> this...
>
> If somebody can enlighten me about the use case for allowing
> containers to be run by unprivileged users, I'd be willing to change
> my mind though...

A quick argument against it, IOW just wait and see!

As unprivileged we don't have CAP_SYS_MODULE set, but inside
unprivileged containers we are root, and a call to cap_get_flag() on
CAP_SYS_MODULE will return CAP_SET! But in reality this is not true,
we don't have CAP_SYS_MODULE... This will confuse programs running
inside containers, and we'll have to add more code paths for this
special case... And not only CAP_SYS_MODULE, perhaps there are other
cases...

--
Djalal Harouni
http://opendz.org
Re: [systemd-devel] systemd-nspawn create container under unprivileged user
On Thu, 05.02.15 15:48, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:

> 2015-02-05 12:44 GMT+03:00 Alban Crequy:
>
> > Manual page namespaces(7):
> >
> >    Creation of new namespaces using clone(2) and unshare(2) in most cases
> >    requires the CAP_SYS_ADMIN capability. User namespaces are the
> >    exception: since Linux 3.8, no privilege is required to create a user
> >    namespace.
>
> So as I understand it, I can't create a full-featured container with
> network as a non-root user (without cap_sys_admin)

Unprivileged containers are unlikely to ever support that. Creating a
network interface on the host will necessarily require privileges. If
you hence want "full network" support (by which I assume you mean veth
links and stuff), then you are generally out of luck...

You can run nspawn containers without CAP_SYS_ADMIN via nspawn's
--drop-capability=CAP_SYS_ADMIN switch. However, YMMV, as the code you
run inside the container must be OK with not having those perms, and
systemd at least until very recently didn't like that at all...

Lennart

--
Lennart Poettering, Red Hat
Re: [systemd-devel] systemd-nspawn create container under unprivileged user
On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:

> Hello!
> Is it possible to create a container as a regular user? Or what
> capabilities do I need to add to create a container without using root?

Invoking containers without privileges is not supported by nspawn, and
this is unlikely to change, as I fail to see any strong use case for
this...

If somebody can enlighten me about the use case for allowing
containers to be run by unprivileged users, I'd be willing to change
my mind though...

Note that to my knowledge any support for unprivileged containers has
been disabled in the kernel on many distros, including Fedora, since
it's basically one giant security hole.

Note that many of machinectl's commands involve polkit checks, which
means it's easy to open them up for unprivileged clients. However, in
that case the containers would be forked off and maintained
privileged, only the clients would be unprivileged...

LXC supports unprivileged containers though, this might be an option
for you.

Lennart

--
Lennart Poettering, Red Hat
Re: [systemd-devel] systemd-nspawn create container under unprivileged user
On 5 February 2015 at 12:48, Vasiliy Tolstov wrote:
>
> 2015-02-05 12:44 GMT+03:00 Alban Crequy:
>>
>> Manual page namespaces(7):
>>
>>    Creation of new namespaces using clone(2) and unshare(2) in most cases
>>    requires the CAP_SYS_ADMIN capability. User namespaces are the
>>    exception: since Linux 3.8, no privilege is required to create a user
>>    namespace.
>
>
> So as I understand it, I can't create a full-featured container with
> network as a non-root user (without cap_sys_admin)

Caps like CAP_SYS_ADMIN don't have a global meaning anymore but refer
to operations a process can do *in its current namespace*. An
unprivileged process (uid!=0, without cap_sys_admin) can join a user
namespace and get uid=0 & cap_sys_admin for operations inside the user
namespace, but it will still have uid!=0 & !cap_sys_admin for
operations in the parent user namespace.

user_namespaces(7) contains userns_child_exec.c and it creates a fully
featured container with network without being root. (I attached a
patched version I was testing)

# # Because I'm using the kernel patched by my distribution
# echo 1 > /proc/sys/kernel/unprivileged_userns_clone
$ gcc -lcap -o userns_child_exec userns_child_exec.c

Here it seems to work:

alban@alban:~$ ls -l /tmp/userns_child_exec
-rwxr-xr-x 1 alban alban 14488 Feb  5 23:24 /tmp/userns_child_exec
alban@alban:~$ id -u
1000
alban@alban:~$ ip link
# ---> will show lo, eth0, wlan0...
alban@alban:~$ /tmp/userns_child_exec -p -m -U -M '0 1000 1' -G '0 1000 1' -n bash
About to exec bash
root@alban:~# id
uid=0(root) gid=0(root) groups=0(root),65534(nogroup)
root@alban:~# ip link
# ---> only lo visible in this namespace

Cheers,
Alban

--- userns_child_exec.orig.c    2015-02-05 23:20:19.208741366 +0100
+++ userns_child_exec.c 2015-01-30 17:01:56.948493001 +0100
@@ -108,6 +108,30 @@
     close(fd);
 }
 
+static void
+write_file(char *content, char *path)
+{
+    int fd;
+    size_t content_len;
+
+    content_len = strlen(content);
+
+    fd = open(path, O_RDWR);
+    if (fd == -1) {
+        fprintf(stderr, "ERROR: open %s: %s\n", path,
+                strerror(errno));
+        exit(EXIT_FAILURE);
+    }
+
+    if (write(fd, content, content_len) != content_len) {
+        fprintf(stderr, "ERROR: write %s: %s\n", content,
+                strerror(errno));
+        exit(EXIT_FAILURE);
+    }
+
+    close(fd);
+}
+
 static int              /* Start function for cloned child */
 childFunc(void *arg)
 {
@@ -149,6 +173,7 @@
     const int MAP_BUF_SIZE = 100;
     char map_buf[MAP_BUF_SIZE];
     char map_path[PATH_MAX];
+    char groups_path[PATH_MAX];
 
     /* Parse command-line options. The initial '+' character in
        the final getopt() argument prevents GNU-style permutation
@@ -225,6 +250,11 @@
         update_map(uid_map, map_path);
     }
     if (gid_map != NULL || map_zero) {
+        snprintf(groups_path, PATH_MAX, "/proc/%ld/setgroups",
+                (long) child_pid);
+        write_file("deny\n", groups_path);
+    }
+    if (gid_map != NULL || map_zero) {
         snprintf(map_path, PATH_MAX, "/proc/%ld/gid_map",
                 (long) child_pid);
         if (map_zero) {

/* userns_child_exec.c

   Licensed under GNU General Public License v2 or later

   Create a child process that executes a shell command in new
   namespace(s); allow UID and GID mappings to be specified when
   creating a user namespace.
*/
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <signal.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <errno.h>

/* A simple error-handling function: print an error message based
   on the value in 'errno' and terminate the calling process */

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

struct child_args {
    char **argv;        /* Command to be executed by child, with args */
    int    pipe_fd[2];  /* Pipe used to synchronize parent and child */
};

static int verbose;

static void
usage(char *pname)
{
    fprintf(stderr, "Usage: %s [options] cmd [arg...]\n\n", pname);
    fprintf(stderr, "Create a child process that executes a shell "
            "command in a new user namespace,\n"
            "and possibly also other new namespace(s).\n\n");
    fprintf(stderr, "Options can be:\n\n");
#define fpe(str) fprintf(stderr, "%s", str);
    fpe("-i New IPC namespace\n");
    fpe("-m New mount namespace\n");
    fpe("-n New network namespace\n");
    fpe("-p New PID namespace\n");
    fpe("-u New UTS namespace\n");
    fpe("-U New user namespace\n");
    fpe("-M uid_map Specify UID map for user namespace\n");
    fpe("-G gid_map Specify GID map for user namespace\n");
    fpe("-z Map user's UID and GID to 0 in user namespace\n");
    fpe("(equivalent to: -M '0 <uid> 1' -G '0 <gid> 1')\n");
    fpe("-v Display verbose messages\n");
    fpe("\n");
    fpe("If -z, -M, or -G is specified, -U is req
Re: [systemd-devel] systemd-nspawn create container under unprivileged user
2015-02-05 12:44 GMT+03:00 Alban Crequy:

> Manual page namespaces(7):
>
>    Creation of new namespaces using clone(2) and unshare(2) in most cases
>    requires the CAP_SYS_ADMIN capability. User namespaces are the
>    exception: since Linux 3.8, no privilege is required to create a user
>    namespace.
>

So as I understand it, I can't create a full-featured container with
network as a non-root user (without cap_sys_admin)

--
Vasiliy Tolstov,
e-mail: v.tols...@selfip.ru
jabber: v...@selfip.ru
Re: [systemd-devel] systemd-nspawn create container under unprivileged user
[reposting - sorry I forgot to Cc the mailing list]

On 4 February 2015 at 23:03, Vasiliy Tolstov wrote:
> Hello!
> Is it possible to create a container as a regular user? Or what
> capabilities do I need to add to create a container without using root?

Hello,

Manual page namespaces(7):

   Creation of new namespaces using clone(2) and unshare(2) in most cases
   requires the CAP_SYS_ADMIN capability. User namespaces are the
   exception: since Linux 3.8, no privilege is required to create a user
   namespace.

systemd-nspawn uses:

src/nspawn/nspawn.c:
        pid = raw_clone(SIGCHLD|CLONE_NEWNS|
                        (arg_share_system ? 0 : CLONE_NEWIPC|CLONE_NEWPID|CLONE_NEWUTS)|
                        (arg_private_network ? CLONE_NEWNET : 0), NULL);

It does not pass CLONE_NEWUSER, so all the namespaces it creates are of
the kinds that require CAP_SYS_ADMIN. So you need to have CAP_SYS_ADMIN
to use systemd-nspawn.

If you want to try user namespaces, it is something that is still moving...

Manual page user_namespaces(7):

   Starting in Linux 3.8, unprivileged processes can create user
   namespaces, and mount, PID, IPC, network, and UTS namespaces can be
   created with just the CAP_SYS_ADMIN capability in the caller's user
   namespace.

But this does not hold on most Linux distributions, as they disable
unprivileged user namespaces and require CAP_SYS_ADMIN anyway. See for
example:
http://anonscm.debian.org/viewvc/kernel/dists/trunk/linux/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch?revision=20773&view=markup
and:
echo 1 > /proc/sys/kernel/unprivileged_userns_clone

Additionally, the program userns_child_exec.c included in manual page
user_namespaces(7) does not work as is yet: since the changes introduced
to fix CVE-2014-8989, it needs to adjust /proc/pid/setgroups. See:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=66d2f338ee4c449396b6f99f5e75cd18eb6df272

Cheers,
Alban
[systemd-devel] systemd-nspawn create container under unprivileged user
Hello!
Is it possible to create a container as a regular user? Or what
capabilities do I need to add to create a container without using root?

--
Vasiliy Tolstov,
e-mail: v.tols...@selfip.ru
jabber: v...@selfip.ru