Re: Threads stuck in zap_pid_ns_processes()

Eric W. Biederman Fri, 12 May 2017 06:33:28 -0700

Vovo Yang <[email protected]> writes:

> On Fri, May 12, 2017 at 7:19 AM, Eric W. Biederman
> <[email protected]> wrote:
>> Guenter Roeck <[email protected]> writes:
>>
>>> What I know so far is
>>> - We see this condition on a regular basis in the field. Regular is
>>>   relative, of course - let's say maybe 1 in a Million Chromebooks
>>>   per day reports a crash because of it. That is not that many,
>>>   but it adds up.
>>> - We are able to reproduce the problem with a performance benchmark
>>>   which opens 100 chrome tabs. While that is a lot, it should not
>>>   result in a kernel hang/crash.
>>> - Vovo provided the test code last night. I don't know if this is
>>>   exactly what is observed in the benchmark, or how it relates to the
>>>   benchmark in the first place, but it is the first time we are actually
>>>   able to reliably create a condition where the problem is seen.
>>
>> Thank you.  It will be interesting to hear what is happening in the
>> chrome performance benchmark that triggers this.
>>
> What's happening in the benchmark:
> 1. A chrome renderer process was created with CLONE_NEWPID
> 2. The process crashed
> 3. Chrome breakpad service calls ptrace(PTRACE_ATTACH, ..) to attach to every
>   thread of the crashed process to dump info
> 4. When breakpad detaches from the crashed process, the crashed process gets
>   stuck in zap_pid_ns_processes()


Very interesting, thank you.

So the question is specifically which interaction is causing this.

In the test case provided it was a sibling task in the pid namespace
dying and not being reaped, which may be what is happening with
breakpad.  So far I have yet to see a kernel bug, but I won't rule one out.
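
For anyone following along, here is a minimal sketch of what "dying and
not being reaped" looks like from userspace.  It is only an illustration:
it uses a plain fork() with no pid namespace and no ptrace, so it does not
reproduce the hang, and the helper names (task_state, make_unreaped_child,
reap) are made up for this example.  It assumes a Linux system with /proc
mounted.

```python
import os
import time


def task_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the pid is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The comm field is parenthesised and may contain spaces,
    # so take the first field after the last ')'.
    return data.rsplit(")", 1)[1].split()[0]


def make_unreaped_child():
    """Fork a child that exits at once; the parent deliberately does not wait()."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)
    # Poll (up to 5 seconds) until the child has exited and shows up as a zombie.
    deadline = time.monotonic() + 5.0
    while task_state(pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
    return pid


def reap(pid):
    """Collect the exit status so the kernel can release the task."""
    os.waitpid(pid, 0)


if __name__ == "__main__":
    child = make_unreaped_child()
    print("before wait:", task_state(child))
    reap(child)
    print("after wait:", task_state(child))
```

Until the parent calls waitpid() the dead child stays around in state Z.
In the test case the unreaped task was a sibling inside the pid namespace
rather than a direct child as in this sketch.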

Eric
