alloc_pidmap() advances pid_namespace::last_pid. If the first pid
allocation fails, the next process created in the namespace gets pid 2,
and pid_ns_prepare_proc() is never called. As a result,
pid_namespace::proc_mnt is never initialized (not to mention that the
namespace has no child reaper).

I saw the following crash stack for such a case on a 3.10 kernel:

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [<ffffffff8126b18f>] proc_flush_task+0x8f/0x1b0
    Call Trace:
        [<ffffffff810807ff>] release_task+0x3f/0x490
        [<ffffffff810c0570>] ? thread_group_cputime_adjusted+0x50/0x70
        [<ffffffff8108144f>] wait_consider_task.part.10+0x7ff/0xb00
        [<ffffffff8108186f>] do_wait+0x11f/0x280
        [<ffffffff81082b2d>] SyS_wait4+0x7d/0x110

We can fix this either by restoring last_pid to 0 or by prohibiting
further allocations. A similar issue was addressed in commit 314a8ad0f18a
by Oleg Nesterov <[email protected]>:
        "pidns: fix free_pid() to handle the first fork failure"
That commit fixed it by prohibiting allocation, so let's do the same
here.

Signed-off-by: Kirill Tkhai <[email protected]>
Acked-by: Cyrill Gorcunov <[email protected]>
---
 kernel/pid.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index 0143ac0ddceb..fd1cde1e4576 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -321,8 +321,10 @@ struct pid *alloc_pid(struct pid_namespace *ns)
        }
 
        if (unlikely(is_child_reaper(pid))) {
-               if (pid_ns_prepare_proc(ns))
+               if (pid_ns_prepare_proc(ns)) {
+                       disable_pid_allocation(ns);
                        goto out_free;
+               }
        }
 
        get_pid_ns(ns);
