On Tue, Oct 20, 2015 at 10:34 AM, Dmitry Vyukov <dvyu...@google.com> wrote:
> On Mon, Oct 19, 2015 at 10:17 PM, Dmitry Vyukov <dvyu...@google.com> wrote:
>> On Mon, Oct 19, 2015 at 9:49 PM, Oleg Nesterov <o...@redhat.com> wrote:
>>> On 10/19, Dmitry Vyukov wrote:
>>>>
>>>> The following program hangs in some interesting state and is not
>>>> killable (started by a normal user, not root):
>>>
>>> Thanks.
>>>
>>>> #include <pthread.h>
>>>> #include <unistd.h>
>>>> #include <sys/ptrace.h>
>>>> #include <stdio.h>
>>>> #include <signal.h>
>>>>
>>>> void *thr(void *arg) {
>>>>         ptrace(PTRACE_TRACEME, 0, 0, 0);
>>>>         sleep(3);
>>>>         kill(getpid(), SIGCHLD);
>>>>         return 0;
>>>> }
>>>>
>>>> int main() {
>>>>         if (fork() == 0) {
>>>>                 sleep(1);
>>>>                 pthread_t th;
>>>>                 pthread_create(&th, 0, thr, 0);
>>>>                 sleep(1);
>>>>         }
>>>>         return 0;
>>>> }
>>>>
>>>>
>>>> The child process attaches as tracee to init process
>>>
>>> Yes, although in a racy manner: the parent can exit after
>>> PTRACE_TRACEME, in which case the kernel will untrace the task
>>> before reparenting. Not that this matters.
>>>
>>>> and then hangs in
>>>> a state that I don't understand. When I did a similar thing but
>>>> attached it to a normal parent process (shell), I still was able to
>>>> get rid of it by killing parent (shell).
>>>
>>> See above.
>>>
>>> So I bet the problem is that your /sbin/init doesn't use __WALL,
>>> so wait() doesn't reap the traced zombie sub-thread, and thus it
>>> can't release the non-empty thread group.
>>>
>>> Could you please verify? Just do "strace -p1" and send SIGCHLD to
>>> init.
>>>
>>> perhaps eligible_child() should assume WALL if ptrace && ZOMBIE...
>>
>>
>> I am using Ubuntu.
>> Here is the strace output from init:
>>
>> waitid(P_ALL, 0, {}, WNOHANG|WEXITED|WSTOPPED|WCONTINUED, NULL) = 0
>>
>> So what should be fixed here? Kernel or distro init?
>
> waitpid(__WALL) indeed joins these processes.
> But __WALL can't be used with waitid and Ubuntu init uses waitid...

I am thinking about how to work around this issue.

The following program joins both child processes:

#include <pthread.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <stdio.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

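/* Thread body: mark this thread as traced by the parent, then exit. */
void *thr(void *arg) {
        ptrace(PTRACE_TRACEME, 0, 0, 0);
        return 0;
}

int main(void) {
        pid_t pid = fork();
        if (pid == 0) {
                /* Child: start the traced thread, give it time to exit, then exit. */
                pthread_t th;
                pthread_create(&th, 0, thr, 0);
                sleep(1);
                return 0;
        }
        int status = 0;
        /* Parent: wait for any child twice; __WALL covers all child types. */
        pid_t res = waitpid(-1, &status, __WALL);
        printf("pid=%d res=%d errno=%d\n", (int)pid, (int)res, errno);
        res = waitpid(-1, &status, __WALL);
        printf("pid=%d res=%d errno=%d\n", (int)pid, (int)res, errno);
        return 0;
}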


However, I need to wait for a particular child. If I change the
first waitpid to:

        int res = waitpid(pid, &status, __WALL);

then the program never terminates.
So how can I wait for such a child process?
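
One direction that might work, sketched here as an assumption and not
something confirmed in this thread: since waitpid(-1, ..., __WALL) does
return for both children in the program above, loop on it until the pid
of interest comes back. The helper name wait_for_child is hypothetical,
and the loop throws away the status of any other child it reaps along
the way, so it would only suit a parent that does not care about its
other children.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from this thread): reap children with
 * waitpid(-1, ..., __WALL) until `target` is returned.
 * Returns `target` on success, or -1 with errno set by waitpid().
 * Caveat: the status of any other child reaped on the way is discarded.
 */
static pid_t wait_for_child(pid_t target, int *status)
{
        for (;;) {
                int st = 0;
                pid_t res = waitpid(-1, &st, __WALL);
                if (res == -1) {
                        if (errno == EINTR)
                                continue;       /* interrupted by a signal, retry */
                        return -1;              /* e.g. ECHILD: nothing left to wait for */
                }
                if (res == target) {
                        if (status)
                                *status = st;
                        return res;
                }
                /* Some other child was reaped; keep waiting for the target. */
        }
}
```

In the program above this would replace the two waitpid calls with a
single wait_for_child(pid, &status).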
