On Wed, Jul 31, 2019 at 07:41:36PM +0200, Oleg Nesterov wrote:
> On 07/31, Adrian Reber wrote:
> >
> > Extending clone3() to support CLONE_SET_TID makes it possible restore a
> > process using CRIU without accessing /proc/sys/kernel/ns_last_pid and
> > race free (as long as the desired PID/TID is available).
>
> I personally like this... but please see the question below.
>
> > +struct pid *alloc_pid(struct pid_namespace *ns, int set_tid)
> >  {
> >          struct pid *pid;
> >          enum pid_type type;
> > @@ -186,12 +186,28 @@ struct pid *alloc_pid(struct pid_namespace *ns)
> >                  if (idr_get_cursor(&tmp->idr) > RESERVED_PIDS)
> >                          pid_min = RESERVED_PIDS;
> >
> > -                /*
> > -                 * Store a null pointer so find_pid_ns does not find
> > -                 * a partially initialized PID (see below).
> > -                 */
> > -                nr = idr_alloc_cyclic(&tmp->idr, NULL, pid_min,
> > -                                      pid_max, GFP_ATOMIC);
> > +                if (set_tid) {
> > +                        /*
> > +                         * Also fail if a PID != 1 is requested
> > +                         * and no PID 1 exists.
> > +                         */
> > +                        if ((set_tid >= pid_max) || ((set_tid != 1) &&
> > +                                (idr_get_cursor(&tmp->idr) <= 1)))
>                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Ah, I forgot to mention... this should work but only because
> RESERVED_PIDS > 0. How about idr_is_empty() ?
>
>
> But the main question is how it can really help if ns->level > 0, unlikely
> CRIU will ever need to clone the process with the same pid_nr == set_tid
> in the ns->parent chain.

Not sure I understand what you mean. For CRIU only the PID in the PID
namespace is relevant.

> So may be kernel_clone_args->set_tid should be pid_t __user *set_tid_array?
> Or I missed something ?

Not sure why and how an array would be needed. Could you give me some
more details why you think this is needed.

                Adrian