Re: [PATCH 2/2] exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting

Eric W. Biederman Tue, 25 Nov 2014 09:52:59 -0800

Oleg Nesterov <[email protected]> writes:

> On 11/24, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <[email protected]> writes:
>>
>> > --- a/kernel/pid.c
>> > +++ b/kernel/pid.c
>> > @@ -320,7 +320,6 @@ struct pid *alloc_pid(struct pid_namespace *ns)
>> >                    goto out_free;
>> >    }
>> >
>> > -  get_pid_ns(ns);
>> >    atomic_set(&pid->count, 1);
>> >    for (type = 0; type < PIDTYPE_MAX; ++type)
>> >            INIT_HLIST_HEAD(&pid->tasks[type]);
>> > @@ -336,7 +335,7 @@ struct pid *alloc_pid(struct pid_namespace *ns)
>> >    }
>> >    spin_unlock_irq(&pidmap_lock);
>> >
>> > -out:
>> > +  get_pid_ns(ns);
>>
>> Moving the label and changing the goto out logic is gratuitously confusing
>> and I think it probably even generates worse code.
>>
>> Furthermore multiple exits make adding debugging code more difficult.
>
> Oh, I strongly disagree but I am not going to argue ;) cleanups are
> always subjective, and I do believe in "maintainer is always right"
> mantra. I can make v2 without this change.


Fair enough.  My primary complaint was that you were changing the logic
and fixing a bug at the same time.  That added noise and made analysis
of what was really going on much more difficult.

>> Moving get_pid_ns down does close a leak in the error handling path.
>
> OK, good.
>
>> However at the moment I can't figure out if it is safe to move
>> get_pid_ns below hlist_add_head_rcu.  Because once we are on the rcu list
>> the pid is findable, and being publicly visible with a bad refcount could
>> cause problems.
>
> The caller has a reference, this ns can't go away. Obviously, otherwise
> get_pid_ns(ns) is not safe.
>
> We need this get_pid_ns() to balance put_pid()->put_pid_ns() which obviously
> won't be called until we return this pid, otherwise everything is wrong.
>
> So I think this should be safe?

My concern is exposing a half-initialized struct pid to the world via an
rcu data structure.  In particular, could one of the rcu users get into
trouble because we haven't called get_pid_ns yet?  That is unclear to me.

That is one of those weird nasty races I would rather not have to
consider and moving the get_pid_ns after hlist_add requires that we
think about it.
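
To make the ordering concern concrete, here is a rough userspace sketch.
It uses C11 atomics as a stand-in for RCU publication; it is not the kernel
API, and demo_obj, demo_slot, demo_publish and demo_lookup are hypothetical
names invented for the illustration.  The point is only that every field a
reader may rely on is set before the pointer becomes findable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object; a userspace analogy, not struct pid. */
struct demo_obj {
    int id;
    int ns_ref_taken;   /* 1 once the reference the object relies on is held */
};

/* Shared slot that readers search; static storage starts out as NULL. */
static _Atomic(struct demo_obj *) demo_slot;

/* Finish all initialization first, then publish with release ordering. */
static void demo_publish(struct demo_obj *obj, int id)
{
    obj->id = id;
    obj->ns_ref_taken = 1;
    atomic_store_explicit(&demo_slot, obj, memory_order_release);
}

/* Readers pair the release store with an acquire load. */
static struct demo_obj *demo_lookup(void)
{
    return atomic_load_explicit(&demo_slot, memory_order_acquire);
}
```

With this ordering a reader that finds the object also sees ns_ref_taken set.
Publishing first and taking the reference afterwards is the case that would
need the extra race analysis.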

To fix the error handling and avoid thinking about the races we have two
choices:
- In the error path that is currently called out_unlock we can drop the
  extra references.
- Immediately after we perform the test that on error jumps to out_unlock
  we call get_pid_ns.

My preference would be the first, as it is a trivially correct one line
change.

In other words, I think this is the obviously correct, trivial fix:

 out_unlock:
        spin_unlock_irq(&pidmap_lock);
+       put_pid_ns(ns);
 out_free:
        while (++i <= ns->level)
                free_pidmap(pid->numbers + i);
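
As a rough illustration of the two choices, here is a small userspace model.
It is only a sketch and not kernel code: the model_* names, the plain int
reference count and the reaper_exiting flag are all assumptions standing in
for get_pid_ns/put_pid_ns and for the test that jumps to out_unlock.  Both
variants leave the namespace count unchanged when allocation fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; every name here is hypothetical, not kernel API. */
struct model_ns {
    int refs;             /* stands in for the pid_namespace reference count */
    bool reaper_exiting;  /* stands in for the test that jumps to out_unlock */
};

struct model_pid {
    struct model_ns *ns;
};

static void model_get_ns(struct model_ns *ns) { ns->refs++; }
static void model_put_ns(struct model_ns *ns) { ns->refs--; }

/* Option 1: keep the early get, drop it again in the late error path. */
static struct model_pid *model_alloc_put_on_error(struct model_ns *ns)
{
    struct model_pid *pid = malloc(sizeof(*pid));

    if (!pid)
        return NULL;
    pid->ns = ns;
    model_get_ns(ns);

    if (ns->reaper_exiting)
        goto out_unlock;
    return pid;

out_unlock:
    model_put_ns(ns);   /* the added line: balances the early get */
    free(pid);
    return NULL;
}

/* Option 2: take the reference only once the failing test has passed. */
static struct model_pid *model_alloc_get_after_check(struct model_ns *ns)
{
    struct model_pid *pid = malloc(sizeof(*pid));

    if (!pid)
        return NULL;
    pid->ns = ns;

    if (ns->reaper_exiting) {
        free(pid);
        return NULL;
    }
    model_get_ns(ns);
    return pid;
}

/* Releasing a pid drops the namespace reference taken at allocation. */
static void model_free_pid(struct model_pid *pid)
{
    model_put_ns(pid->ns);
    free(pid);
}
```

Option 1 touches only the error path, which is why it is the smaller change.
Option 2 moves the get, so the success path has to be re-read as well.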
 


Eric
