On Sat, Dec 12, 2015 at 09:12:41PM +0100, Jann Horn wrote:
> With this change, the entering process can first enter the
> namespace and then safely inspect the namespace's
> properties, e.g. through /proc/self/{uid_map,gid_map},
> assuming that the namespace owner doesn't have access to
> uid 0.

Actually, I think I missed something there. Well, at least it
should not directly lead to a container escape.


> -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
>  {
> +     struct user_namespace *tns = tcred->user_ns;
> +     struct user_namespace *curns = current_cred()->user_ns;
> +
> +     /* When a root-owned process enters a user namespace created by a
> +      * malicious user, the user shouldn't be able to execute code under
> +      * uid 0 by attaching to the root-owned process via ptrace.
> +      * Therefore, similar to the capable_wrt_inode_uidgid() check,
> +      * verify that all the uids and gids of the target process are
> +      * mapped into the current namespace.
> +      * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> +      * either.
> +      */
> +     if (!kuid_has_mapping(curns, tcred->euid) ||
> +                     !kuid_has_mapping(curns, tcred->suid) ||
> +                     !kuid_has_mapping(curns, tcred->uid)  ||
> +                     !kgid_has_mapping(curns, tcred->egid) ||
> +                     !kgid_has_mapping(curns, tcred->sgid) ||
> +                     !kgid_has_mapping(curns, tcred->gid))
> +             return false;
> +
>       if (mode & PTRACE_MODE_NOAUDIT)
> -             return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> +             return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
>       else
> -             return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> +             return has_ns_capability(current, tns, CAP_SYS_PTRACE);
>  }

If the namespace owner can run code in the init namespace, the kuids are
mapped into curns but he is still capable wrt the target namespace.

I think a proper fix should first determine the highest parent of
tcred->user_ns in which the caller still has privs, then do the
kxid_has_mapping() checks in there.

Attachment: signature.asc
Description: Digital signature

Reply via email to