[Devel] Re: [PATCH 6/9] user namespaces: convert all capable checks in kernel/sys.c

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:42 +
"Serge E. Hallyn"  wrote:

> @@ -1177,8 +1189,11 @@ SYSCALL_DEFINE2(sethostname, char __user *, name, int, 
> len)
>   int errno;
>   char tmp[__NEW_UTS_LEN];
>  
> - if (!ns_capable(current->nsproxy->uts_ns->user_ns, CAP_SYS_ADMIN))
> + if (!ns_capable(current->nsproxy->uts_ns->user_ns, CAP_SYS_ADMIN)) {
> + printk(KERN_NOTICE "%s: did not have CAP_SYS_ADMIN\n", 
> __func__);
>   return -EPERM;
> + }
> + printk(KERN_NOTICE "%s: did have CAP_SYS_ADMIN\n", __func__);
>   if (len < 0 || len > __NEW_UTS_LEN)
>   return -EINVAL;
>   down_write(&uts_sem);

Left over debugging printks?
___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 9/9] userns: check user namespace for task->file uid equivalence checks

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:04:07 +
"Serge E. Hallyn"  wrote:

> Cheat for now and say all files belong to init_user_ns.  Next
> step will be to let superblocks belong to a user_ns, and derive
> inode_userns(inode) from inode->i_sb->s_user_ns.  Finally we'll
> introduce more flexible arrangements.
> 
>
> ...
>
> +
> +/*
> + * return 1 if current either has CAP_FOWNER to the
> + * file, or owns the file.
> + */
> +int is_owner_or_cap(const struct inode *inode)
> +{
> + struct user_namespace *ns = inode_userns(inode);
> +
> + if (current_user_ns() == ns && current_fsuid() == inode->i_uid)
> + return 1;
> + if (ns_capable(ns, CAP_FOWNER))
> + return 1;
> + return 0;
> +}

bool?

> +EXPORT_SYMBOL(is_owner_or_cap);

There's a fairly well adhered to convention that global symbols (and
often static symbols) have a prefix which identifies the subsystem to
which they belong.  This patchset rather scorns that convention.

Most of these identifiers are pretty obviously from the capability
subsystem, but still...

>
> ...
>

___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 6/9] user namespaces: convert all capable checks in kernel/sys.c

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:42 +
"Serge E. Hallyn"  wrote:

> This allows setuid/setgid in containers.  It also fixes some
> corner cases where kernel logic foregoes capability checks when
> uids are equivalent.  The latter will need to be done throughout
> the whole kernel.
> 
>
> ...
>
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -118,17 +118,29 @@ EXPORT_SYMBOL(cad_pid);
>  
>  void (*pm_power_off_prepare)(void);
>  
> +/* called with rcu_read_lock, creds are safe */
> +static inline int set_one_prio_perm(struct task_struct *p)
> +{
> + const struct cred *cred = current_cred(), *pcred = __task_cred(p);
> +
> + if (pcred->user->user_ns == cred->user->user_ns &&
> + (pcred->uid  == cred->euid ||
> +  pcred->euid == cred->euid))
> + return 1;
> + if (ns_capable(pcred->user->user_ns, CAP_SYS_NICE))
> + return 1;
> + return 0;
> +}

uninline.  Document return value?

>
> ...
>

___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 7/9] add a user namespace owner of ipc ns

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:49 +
"Serge E. Hallyn"  wrote:

>
> ...
>
> --- a/include/linux/ipc_namespace.h
> +++ b/include/linux/ipc_namespace.h
> @@ -24,6 +24,7 @@ struct ipc_ids {
>   struct idr ipcs_idr;
>  };
>  
> +struct user_namespace;

Move to top of file.

>  struct ipc_namespace {
>   atomic_tcount;
>   struct ipc_ids  ids[3];
> @@ -56,6 +57,8 @@ struct ipc_namespace {
>   unsigned intmq_msg_max;  /* initialized to DFLT_MSGMAX */
>   unsigned intmq_msgsize_max;  /* initialized to DFLT_MSGSIZEMAX */
>  
> + /* user_ns which owns the ipc ns */
> + struct user_namespace *user_ns;
>  };
>  
>  extern struct ipc_namespace init_ipc_ns;
> diff --git a/ipc/msgutil.c b/ipc/msgutil.c
> index f095ee2..d91ff4b 100644
> --- a/ipc/msgutil.c
> +++ b/ipc/msgutil.c
> @@ -20,6 +20,8 @@
>  
>  DEFINE_SPINLOCK(mq_lock);
>  
> +extern struct user_namespace init_user_ns;

Should be declared in .h, not in .c.

>  /*
>   * The next 2 defines are here bc this is the only file
>   * compiled when either CONFIG_SYSVIPC and CONFIG_POSIX_MQUEUE
>
> ...
>

___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 5/9] Allow ptrace from non-init user namespaces

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:33 +
"Serge E. Hallyn"  wrote:

> ptrace is allowed to tasks in the same user namespace according to
> the usual rules (i.e. the same rules as for two tasks in the init
> user namespace).  ptrace is also allowed to a user namespace to
> which the current task the has CAP_SYS_PTRACE capability.
> 
>
> ...
>
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -546,6 +546,8 @@ extern const kernel_cap_t __cap_init_eff_set;
>   */
>  #define has_capability(t, cap) (security_real_capable((t), &init_user_ns, 
> (cap)) == 0)
>  
> +#define has_ns_capability(t, ns, cap) (security_real_capable((t), (ns), 
> (cap)) == 0)

macroitis.

>  /**
>   * has_capability_noaudit - Determine if a task has a superior capability 
> available (unaudited)
>   * @t: The task in question
> diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> index faf4679..862fc59 100644
> --- a/include/linux/user_namespace.h
> +++ b/include/linux/user_namespace.h
> @@ -39,6 +39,9 @@ static inline void put_user_ns(struct user_namespace *ns)
>  uid_t user_ns_map_uid(struct user_namespace *to, const struct cred *cred, 
> uid_t uid);
>  gid_t user_ns_map_gid(struct user_namespace *to, const struct cred *cred, 
> gid_t gid);
>  
> +int same_or_ancestor_user_ns(struct task_struct *task,
> + struct task_struct *victim);

bool.

>  #else
>  
>  static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
>
> ...
>
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -129,6 +129,22 @@ gid_t user_ns_map_gid(struct user_namespace *to, const 
> struct cred *cred, gid_t
>   return overflowgid;
>  }
>  
> +int same_or_ancestor_user_ns(struct task_struct *task,
> + struct task_struct *victim)
> +{
> + struct user_namespace *u1 = task_cred_xxx(task, user)->user_ns;
> + struct user_namespace *u2 = task_cred_xxx(victim, user)->user_ns;
> + for (;;) {
> + if (u1 == u2)
> + return 1;
> + if (u1 == &init_user_ns)
> + return 0;
> + u1 = u1->creator->user_ns;
> + }
> + /* We never get here */
> + return 0;

Remove?

> +}
> +
>  static __init int user_namespaces_init(void)
>  {
>   user_ns_cachep = KMEM_CACHE(user_namespace, SLAB_PANIC);
>
> ...
>
>  int cap_ptrace_access_check(struct task_struct *child, unsigned int mode)
>  {
>   int ret = 0;
> + const struct cred *cred, *tcred;
>  
>   rcu_read_lock();
> - if (!cap_issubset(__task_cred(child)->cap_permitted,
> -   current_cred()->cap_permitted) &&
> - !capable(CAP_SYS_PTRACE))
> - ret = -EPERM;
> + cred = current_cred();
> + tcred = __task_cred(child);
> + /*
> +  * The ancestor user_ns check may be gratuitous, as I think
> +  * we've already guaranteed that in kernel/ptrace.c.
> +  */

?

> + if (same_or_ancestor_user_ns(current, child) &&
> + cap_issubset(tcred->cap_permitted, cred->cap_permitted))
> + goto out;
> + if (ns_capable(tcred->user->user_ns, CAP_SYS_PTRACE))
> + goto out;
> + ret = -EPERM;
> +out:
>   rcu_read_unlock();
>   return ret;
>  }
>
> ...
>

___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 4/9] allow killing tasks in your own or child userns

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:25 +
"Serge E. Hallyn"  wrote:

>  /*
> + * called with RCU read lock from check_kill_permission()
> + */
> +static inline int kill_ok_by_cred(struct task_struct *t)
> +{
> + const struct cred *cred = current_cred();
> + const struct cred *tcred = __task_cred(t);
> +
> + if (cred->user->user_ns == tcred->user->user_ns &&
> + (cred->euid == tcred->suid ||
> +  cred->euid == tcred->uid ||
> +  cred->uid  == tcred->suid ||
> +  cred->uid  == tcred->uid))
> + return 1;
> +
> + if (ns_capable(tcred->user->user_ns, CAP_KILL))
> + return 1;
> +
> + return 0;
> +}

The compiler will inline this for us.

> +/*
>   * Bad permissions for sending the signal
>   * - the caller must hold the RCU read lock
>   */
>  static int check_kill_permission(int sig, struct siginfo *info,
>struct task_struct *t)
>  {
> - const struct cred *cred, *tcred;
>   struct pid *sid;
>   int error;
>  
> @@ -656,14 +676,8 @@ static int check_kill_permission(int sig, struct siginfo 
> *info,
>   if (error)
>   return error;
>  
> - cred = current_cred();
> - tcred = __task_cred(t);
>   if (!same_thread_group(current, t) &&
> - (cred->euid ^ tcred->suid) &&
> - (cred->euid ^ tcred->uid) &&
> - (cred->uid  ^ tcred->suid) &&
> - (cred->uid  ^ tcred->uid) &&
> - !capable(CAP_KILL)) {
> + !kill_ok_by_cred(t)) {
>   switch (sig) {
>   case SIGCONT:
>   sid = task_session(t);

___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 2/9] security: Make capabilities relative to the user namespace.

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:06 +
"Serge E. Hallyn"  wrote:

> - Introduce ns_capable to test for a capability in a non-default
>   user namespace.
> - Teach cap_capable to handle capabilities in a non-default
>   user namespace.
> 
> The motivation is to get to the unprivileged creation of new
> namespaces.  It looks like this gets us 90% of the way there, with
> only potential uid confusion issues left.
> 
> I still need to handle getting all caps after creation but otherwise I
> think I have a good starter patch that achieves all of your goals.
> 
>
> ...
>
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -544,7 +544,7 @@ extern const kernel_cap_t __cap_init_eff_set;
>   *
>   * Note that this does not set PF_SUPERPRIV on the task.
>   */
> -#define has_capability(t, cap) (security_real_capable((t), (cap)) == 0)
> +#define has_capability(t, cap) (security_real_capable((t), &init_user_ns, 
> (cap)) == 0)
>  
>  /**
>   * has_capability_noaudit - Determine if a task has a superior capability 
> available (unaudited)
> @@ -558,9 +558,15 @@ extern const kernel_cap_t __cap_init_eff_set;
>   * Note that this does not set PF_SUPERPRIV on the task.
>   */
>  #define has_capability_noaudit(t, cap) \
> - (security_real_capable_noaudit((t), (cap)) == 0)
> + (security_real_capable_noaudit((t), &init_user_ns, (cap)) == 0)
>  
> +struct user_namespace;
> +extern struct user_namespace init_user_ns;

Two icky-should-be-written-in-C macros which reference init_user_ns,
followed by the declaration of init_user_ns and its type.  Declarations
which duplicate those in other header files.  It's ripe for some
upcleaning, methinks?

Also, please ensure that the forward struct declarations are all at
top-of-file (as in include/linux/security.h).  Otherwise we can end up
accumulating multiple forward declarations of the same thing in the one
file.

>  extern int capable(int cap);
> +extern int ns_capable(struct user_namespace *ns, int cap);
> +extern int task_ns_capable(struct task_struct *t, int cap);
> +
> +#define nsown_capable(cap) (ns_capable(current_user_ns(), (cap)))

macroitis!

> @@ -301,15 +302,42 @@ error:
>   */
>  int capable(int cap)
>  {
> + return ns_capable(&init_user_ns, cap);
> +}
> +EXPORT_SYMBOL(capable);
> +
> +/**
> + * ns_capable - Determine if the current task has a superior capability in 
> effect
> + * @ns:  The usernamespace we want the capability in
> + * @cap: The capability to be tested for
> + *
> + * Return true if the current task has the given superior capability 
> currently
> + * available for use, false if not.

Actually it doesn't return true or false - it returns 1 or 0.  Using a
`bool' return type would fix the comment :)

> + * This sets PF_SUPERPRIV on the task if the capability is available on the
> + * assumption that it's about to be used.
> + */
> +int ns_capable(struct user_namespace *ns, int cap)
> +{
>   if (unlikely(!cap_valid(cap))) {
>   printk(KERN_CRIT "capable() called with invalid cap=%u\n", cap);
>   BUG();
>   }
>  
> - if (security_capable(current_cred(), cap) == 0) {
> + if (security_capable(ns, current_cred(), cap) == 0) {
>   current->flags |= PF_SUPERPRIV;
>   return 1;
>   }
>   return 0;
>  }
> -EXPORT_SYMBOL(capable);
> +EXPORT_SYMBOL(ns_capable);
> +
> +/*
> + * does current have capability 'cap' to the user namespace of task
> + * 't'.  Return true if it does, false otherwise.
> + */

Other comments were kerneldocified.

> +int task_ns_capable(struct task_struct *t, int cap)
> +{
> + return ns_capable(task_cred_xxx(t, user)->user_ns, cap);
> +}
> +EXPORT_SYMBOL(task_ns_capable);

Could return bool.

>
> ...
>
> +int cap_capable(struct task_struct *tsk, const struct cred *cred,
> + struct user_namespace *targ_ns, int cap, int audit)
>  {
> - return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
> + for (;;) {
> + /* The creator of the user namespace has all caps. */
> + if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
> + return 0;
> +
> + /* Do we have the necessary capabilities? */
> + if (targ_ns == cred->user->user_ns)
> + return cap_raised(cred->cap_effective, cap) ? 0 : 
> -EPERM;
> +
> + /* Have we tried all of the parent namespaces? */
> + if (targ_ns == &init_user_ns)
> + return -EPERM;
> +
> + /* If you have the capability in a parent user ns you have it
> +  * in the over all children user namespaces as well, so see
> +  * if this process has the capability in the parent user
> +  * namespace.
> +  */
> + targ_ns = targ_ns->creator->user_ns;
> + }
> +
> + /* We never get here */
> + return -EPERM;

So delete the code?  Or does the compiler warn?  If so, it's pretty busted.

>  }
>  
>  /*

[Devel] Re: [PATCH 1/9] Add a user_namespace as creator/owner of uts_namespace

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:02:57 +
"Serge E. Hallyn"  wrote:

> +/*
> + * userns count is 1 for root user, 1 for init_uts_ns,
> + * and 1 for... ?
> + */

?
___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 3/9] allow sethostname in a container

2011-02-18 Thread Daniel Lezcano
On 02/17/2011 04:03 PM, Serge E. Hallyn wrote:
> Signed-off-by: Serge E. Hallyn
> ---
Acked-by: Daniel Lezcano 
___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 2/9] security: Make capabilities relative to the user namespace.

2011-02-18 Thread Daniel Lezcano
On 02/17/2011 04:03 PM, Serge E. Hallyn wrote:
> - Introduce ns_capable to test for a capability in a non-default
>user namespace.
> - Teach cap_capable to handle capabilities in a non-default
>user namespace.
>
> The motivation is to get to the unprivileged creation of new
> namespaces.  It looks like this gets us 90% of the way there, with
> only potential uid confusion issues left.
>
> I still need to handle getting all caps after creation but otherwise I
> think I have a good starter patch that achieves all of your goals.
>
> Changelog:
>   11/05/2010: [serge] add apparmor
>   12/14/2010: [serge] fix capabilities to created user namespaces
>   Without this, if user serge creates a user_ns, he won't have
>   capabilities to the user_ns he created.  THis is because we
>   were first checking whether his effective caps had the caps
>   he needed and returning -EPERM if not, and THEN checking whether
>   he was the creator.  Reverse those checks.
>   12/16/2010: [serge] security_real_capable needs ns argument in 
> !security case
>   01/11/2011: [serge] add task_ns_capable helper
>   01/11/2011: [serge] add nsown_capable() helper per Bastian Blank 
> suggestion
>   02/16/2011: [serge] fix a logic bug: the root user is always creator of
>   init_user_ns, but should not always have capabilities to
>   it!  Fix the check in cap_capable().
>
> Signed-off-by: Eric W. Biederman
> Signed-off-by: Serge E. Hallyn
> ---

Acked-by: Daniel Lezcano 

___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH] Reduce uidhash lock hold time when lookup succeeds

2011-02-18 Thread Serge E. Hallyn
Quoting Matt Helsley (matth...@us.ibm.com):
> When lookup succeeds we don't need the "new" user struct which hasn't
> been linked into the uidhash. So we can immediately drop the lock and
> then free "new" rather than free it with the lock held.
> 
> Signed-off-by: Matt Helsley 
> Cc: David Howells 
> Cc: Pavel Emelyanov 
> Cc: Alexey Dobriyan 
> Cc: "Serge E. Hallyn" 

Acked-by: Serge E. Hallyn 

And might I say that the label 'out_unlock' in that function is
sadly named :)

> Cc: contain...@lists.linux-foundation.org
> ---
>  kernel/user.c |   12 +++-
>  1 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/user.c b/kernel/user.c
> index 5c598ca..4ea8e58 100644
> --- a/kernel/user.c
> +++ b/kernel/user.c
> @@ -157,16 +157,18 @@ struct user_struct *alloc_uid(struct user_namespace 
> *ns, uid_t uid)
>*/
>   spin_lock_irq(&uidhash_lock);
>   up = uid_hash_find(uid, hashent);
> - if (up) {
> + if (!up) {
> + uid_hash_insert(new, hashent);
> + up = new;
> + }
> + spin_unlock_irq(&uidhash_lock);
> +
> + if (up != new) {
>   put_user_ns(ns);
>   key_put(new->uid_keyring);
>   key_put(new->session_keyring);
>   kmem_cache_free(uid_cachep, new);
> - } else {
> - uid_hash_insert(new, hashent);
> - up = new;
>   }
> - spin_unlock_irq(&uidhash_lock);
>   }
>  
>   return up;
> -- 
> 1.6.3.3
> 
> ___
> Containers mailing list
> contain...@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/containers
___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 1/9] Add a user_namespace as creator/owner of uts_namespace

2011-02-18 Thread Daniel Lezcano
On 02/17/2011 04:02 PM, Serge E. Hallyn wrote:
> copy_process() handles CLONE_NEWUSER before the rest of the
> namespaces.  So in the case of clone(CLONE_NEWUSER|CLONE_NEWUTS)
> the new uts namespace will have the new user namespace as its
> owner.  That is what we want, since we want root in that new
> userns to be able to have privilege over it.
>
> Changelog:
>   Feb 15: don't set uts_ns->user_ns if we didn't create
>   a new uts_ns.
>
> Signed-off-by: Serge E. Hallyn

Acked-by: Daniel Lezcano 

A couple of comments.

> ---
>   include/linux/utsname.h |3 +++
>   init/version.c  |2 ++
>   kernel/nsproxy.c|5 +
>   kernel/user.c   |8 ++--
>   kernel/utsname.c|4 
>   5 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/utsname.h b/include/linux/utsname.h
> index 69f3997..85171be 100644
> --- a/include/linux/utsname.h
> +++ b/include/linux/utsname.h
> @@ -37,9 +37,12 @@ struct new_utsname {
>   #include
>   #include
>
> +struct user_namespace;
> +
>   struct uts_namespace {
>   struct kref kref;
>   struct new_utsname name;
> + struct user_namespace *user_ns;
>   };
>   extern struct uts_namespace init_uts_ns;
>
> diff --git a/init/version.c b/init/version.c
> index adff586..97bb86f 100644
> --- a/init/version.c
> +++ b/init/version.c
> @@ -21,6 +21,7 @@ extern int version_string(LINUX_VERSION_CODE);
>   int version_string(LINUX_VERSION_CODE);
>   #endif
>
> +extern struct user_namespace init_user_ns;
>   struct uts_namespace init_uts_ns = {
>   .kref = {
>   .refcount   = ATOMIC_INIT(2),
> @@ -33,6 +34,7 @@ struct uts_namespace init_uts_ns = {
>   .machine= UTS_MACHINE,
>   .domainname = UTS_DOMAINNAME,
>   },
> + .user_ns =&init_user_ns,
>   };
>   EXPORT_SYMBOL_GPL(init_uts_ns);
>
> diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> index f74e6c0..034dc2e 100644
> --- a/kernel/nsproxy.c
> +++ b/kernel/nsproxy.c
> @@ -74,6 +74,11 @@ static struct nsproxy *create_new_namespaces(unsigned long 
> flags,
>   err = PTR_ERR(new_nsp->uts_ns);
>   goto out_uts;
>   }
> + if (new_nsp->uts_ns != tsk->nsproxy->uts_ns) {
> + put_user_ns(new_nsp->uts_ns->user_ns);
> + new_nsp->uts_ns->user_ns = task_cred_xxx(tsk, user)->user_ns;
> + get_user_ns(new_nsp->uts_ns->user_ns);
> + }

IMO you should add a comment telling this code assume create_user_ns was 
called before (via copy_cred).

>
>   new_nsp->ipc_ns = copy_ipcs(flags, tsk->nsproxy->ipc_ns);
>   if (IS_ERR(new_nsp->ipc_ns)) {

[ ... ]

>   static struct uts_namespace *create_uts_ns(void)
>   {
> @@ -40,6 +41,8 @@ static struct uts_namespace *clone_uts_ns(struct 
> uts_namespace *old_ns)
>
>   down_read(&uts_sem);
>   memcpy(&ns->name,&old_ns->name, sizeof(ns->name));
> + ns->user_ns = old_ns->user_ns;
> + get_user_ns(ns->user_ns);

ns->user_ns = get_user_ns(old_ns->user_ns);

___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 2/2] pidns: Support unsharing the pid namespace.

2011-02-18 Thread Oleg Nesterov
On 02/17, Greg Kurz wrote:
>
> On 02/17/2011 09:29 PM, Oleg Nesterov wrote:
>> On 02/17, Daniel Lezcano wrote:
>>>
>>> On 02/15/2011 08:01 PM, Oleg Nesterov wrote:

 I have to admit, I can't say I like this very much. OK, if we need
 this, can't we just put something into, say, signal->flags so that
 copy_process can check and create the new namespace.

 Also. I remember, I already saw something like this and google found
 my questions. I didn't actually read the new version, perhaps my
 concerns were already answered...

But what if the task T does unshare(CLONE_NEWPID) and then, say,
pthread_create() ? Unless I missed something, the new thread won't
be able to see T ?
>>>
>>> Right. Is it really a problem ? I mean it is a weird use case where we
>>> fall in a weird situation.
>>
>> But this is really weird! How it is possible that the parent can't see
>> its own child? No matter which thread did fork(), the new process is
>
> Hmmm... I guess you mean the opposite. The way pid namespaces are
> nested, parents always see their children.

Well, yes. But it can't see this child using the same pid number,
unless I missed something.

> But indeed, the child thread
> can't see its group leader and that's kind of unusual.

This too. And to me this is more "kind of buggy". But yes, I am
biased because I dislike this approach in general ;)

And, once again, this patch also lacks the necessary s/nsproxy/atcive_pid_ns/
changes.

Anyway. It is very possible I missed something. As I said, I didn't
actually read this version and I forgot all I knew about this change
before.

But afaics this patch is buggy in its current form.

Oleg.

___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel