On Sat, Apr 17, 2021 at 03:04:34PM -0500, Serge Hallyn wrote:
> A process running as uid 0 but without cap_setfcap currently can simply
> unshare a new user namespace with uid 0 mapped to 0.  While this task
> will not have new capabilities against the parent namespace, there is
> a loophole due to the way namespaced file capabilities work.  File
> capabilities valid in userns 1 are distinguised from file capabilities
> valid in userns 2 by the kuid which underlies uid 0.  Therefore
> the restricted root process can unshare a new self-mapping namespace,
> add a namespaced file capability onto a file, then use that file
> capability in the parent namespace.
> 
> To prevent that, do not allow mapping uid 0 if the process which
> opened the uid_map file does not have CAP_SETFCAP, which is the capability
> for setting file capabilities.
> 
> A further wrinkle:  a task can unshare its user namespace, then
> open its uid_map file itself, and map (only) its own uid.  In this
> case we do not have the credential from before unshare,  which was
> potentially more restricted.  So, when creating a user namespace, we
> record whether the creator had CAP_SETFCAP.  Then we can use that
> during map_write().
> 
> With this patch:
> 
> 1. unprivileged user can still unshare -Ur
> 
> ubuntu@caps:~$ unshare -Ur
> root@caps:~# logout
> 
> 2. root user can still unshare -Ur
> 
> ubuntu@caps:~$ sudo bash
> root@caps:/home/ubuntu# unshare -Ur
> root@caps:/home/ubuntu# logout
> 
> 3. root user without CAP_SETFCAP cannot unshare -Ur:
> 
> root@caps:/home/ubuntu# /sbin/capsh --drop=cap_setfcap --
> root@caps:/home/ubuntu# /sbin/setcap cap_setfcap=p /sbin/setcap
> unable to set CAP_SETFCAP effective capability: Operation not permitted
> root@caps:/home/ubuntu# unshare -Ur
> unshare: write failed /proc/self/uid_map: Operation not permitted
> 
> Signed-off-by: Serge Hallyn <se...@hallyn.com>
> 
> Changelog:
>    * fix logic in the case of writing to another task's uid_map
>    * rename 'ns' to 'map_ns', and make a file_ns local variable
>    * use /* comments */
>    * update the CAP_SETFCAP comment in capability.h
>    * rename parent_unpriv to parent_can_setfcap (and reverse the
>      logic)
>    * remove printks
>    * clarify (i hope) the code comments
>    * update capability.h comment
>    * renamed parent_can_setfcap to parent_could_setfcap
>    * made the check its own disallowed_0_mapping() fn
>    * moved the check into new_idmap_permitted
>    * rename disallowed_0_mapping to verify_root_mapping
>    * change verify_root_mapping to Christian's suggested flow
> ---

Thank you. This looks good. I tested this with:

- fstests
- LXD testsuite
- Podman testsuite
- libcap testsuite

Tested-by: Christian Brauner <christian.brau...@ubuntu.com>
Reviewed-by: Christian Brauner <christian.brau...@ubuntu.com>

Reply via email to