Add to cc Lennart Poettering <lenn...@poettering.net>
On 01/23/2017 02:55 PM, Oleg Nesterov wrote:
On 01/22, Pavel Tikhomirov wrote:
Hmm. could you explain how this change helps CRIU? I mean, why
restorer can't do prctl(CHILD_SUBREAPER) before the first fork?
Imagine we have these tree in pidns:
1: has_child_subreaper == 0 && is_child_subreaper == 0
|-2: has_child_subreaper == 0 && is_child_subreaper == 1
| |-3: has_child_subreaper == 0 && is_child_subreaper == 0
| | |-5: has_child_subreaper == 0 && is_child_subreaper == 0
| |-4: has_child_subreaper == 1 && is_child_subreaper == 0
| | |-6: has_child_subreaper == 1 && is_child_subreaper == 0
before c/r: If 4 dies 6 will reparent to 2, if 3 dies 5 will reparent to 1.
after c/r: (where restorer had is_child_subreaper == 1, everybody in the
tree will have has_child_subreaper == 1) Everybody will reparent to 2.
This is clear, but this can only happen if 2 forks 3 and after that
sets is_child_subreaper, right?
And if someone actually does this then your patch can break this
application, no?
IOW. Currently CRIU can't restore the process tree with the same
has_child_subreaper bits if some process forks before
prctl(PR_SET_CHILD_SUBREAPER). It restores the tree as if prctl()
was called before the 1st fork.
So you change the semantics of PR_SET_CHILD_SUBREAPER and now CRIU
is fine simply because you remove this feature: the sub-reaper can
no longer pre-fork the children which should reparent to the previous
reaper.
I won't really argure, but I am not sure this is good idea...
If one task uses these feature now it must be very carefull: if some our
ancestor have enabled is_child_subreaper somewhere up the tree, forked
our tree and after that disabled is_child_subreaper, so we already have
has-flag and all children will inherit has-flag irrelevant to what is
our order of fork/prctl-ing to become subreaper.
And only one way to check if the task has no has_child_subreaper flag is
to create some childs kill them and see to where they will reparent, but
I doubt that someone is doing these now.
At least I think this should be clearly documented.
Yes, I surely need to add some documentation here. Thanks for mentioning
that! - Will do.
You don't need this new member and descendants_lock. task_struct has
the ->real_parent pointer so you can work the tree without recursion.
Sorry I don't get how I can walk down the tree of all descendants with help
of ->real_parent pointer, can you please point on some example or explain a
bit more? (I see task_is_descendant() in security/yama/yama_lsm.c but we
will need to check it for every process, not only descendants, the latter
can be a lot faster.)
I'll send a patch, probably a generic helper makes sense.
Btw task_is_descendant() looks wrong at first glance.
Oleg.
--
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.