On Wed, Nov 6, 2013 at 4:35 PM, David Rientjes wrote:
> On Wed, 6 Nov 2013, Sameer Nanda wrote:
>
>> David -- I think we can make the duration that the tasklist_lock is
>> held smaller by consolidating the process selection logic that is
>> currently split across select_bad_process and
On Wed, Nov 6, 2013 at 4:35 PM, David Rientjes rient...@google.com wrote:
On Wed, 6 Nov 2013, Sameer Nanda wrote:
David -- I think we can make the duration that the tasklist_lock is
held smaller by consolidating the process selection logic that is
currently split across select_bad_process and
On Wed, 6 Nov 2013, Sameer Nanda wrote:
> David -- I think we can make the duration that the tasklist_lock is
> held smaller by consolidating the process selection logic that is
> currently split across select_bad_process and oom_kill_process into
> one place in select_bad_process. The
(adding back context from thread history)
On Tue, Nov 5, 2013 at 5:18 PM, David Rientjes wrote:
> On Tue, 5 Nov 2013, Sameer Nanda wrote:
>
>> The selection of the process to be killed happens in two spots -- first
>> in select_bad_process and then a further refinement by looking for
>> child
(adding back context from thread history)
On Tue, Nov 5, 2013 at 5:18 PM, David Rientjes rient...@google.com wrote:
On Tue, 5 Nov 2013, Sameer Nanda wrote:
The selection of the process to be killed happens in two spots -- first
in select_bad_process and then a further refinement by looking for
On Wed, 6 Nov 2013, Sameer Nanda wrote:
David -- I think we can make the duration that the tasklist_lock is
held smaller by consolidating the process selection logic that is
currently split across select_bad_process and oom_kill_process into
one place in select_bad_process. The tasklist_lock
Regarding other fixes: would it be possible to have the thread
iterator insert a dummy marker element in the thread list before the
scan? There would be one such dummy element per CPU, so that multiple
CPUs can scan the list in parallel. The loop would skip such
elements, and each dummy element
It's interesting that this was known for 3+ years, but nobody bothered
adding a small warning to the code.
We noticed this because it's actually happening on Chromebooks in the
field. We try to minimize OOM kills, but we can deal with them. Of
course, a hung kernel we cannot deal with.
On Tue,
Ccing Oleg and Sergey
On Tue, Nov 05, 2013 at 05:27:29PM -0800, David Rientjes wrote:
> On Tue, 5 Nov 2013, Luigi Semenzato wrote:
>
> > It's not enough to hold a reference to the task struct, because it can
> > still be taken out of the circular list of threads. The RCU
> > assumptions don't
On Tue, 5 Nov 2013, Luigi Semenzato wrote:
> It's not enough to hold a reference to the task struct, because it can
> still be taken out of the circular list of threads. The RCU
> assumptions don't hold in that case.
>
Could you please post a proper bug report that isolates this at the cause?
It's not enough to hold a reference to the task struct, because it can
still be taken out of the circular list of threads. The RCU
assumptions don't hold in that case.
On Tue, Nov 5, 2013 at 5:18 PM, David Rientjes wrote:
> On Tue, 5 Nov 2013, Sameer Nanda wrote:
>
>> The selection of the
On Tue, 5 Nov 2013, Sameer Nanda wrote:
> The selection of the process to be killed happens in two spots -- first
> in select_bad_process and then a further refinement by looking for
> child processes in oom_kill_process. Since this is a two step process,
> it is possible that the process
The selection of the process to be killed happens in two spots -- first
in select_bad_process and then a further refinement by looking for
child processes in oom_kill_process. Since this is a two step process,
it is possible that the process selected by select_bad_process may get a
SIGKILL just
The selection of the process to be killed happens in two spots -- first
in select_bad_process and then a further refinement by looking for
child processes in oom_kill_process. Since this is a two step process,
it is possible that the process selected by select_bad_process may get a
SIGKILL just
On Tue, 5 Nov 2013, Sameer Nanda wrote:
The selection of the process to be killed happens in two spots -- first
in select_bad_process and then a further refinement by looking for
child processes in oom_kill_process. Since this is a two step process,
it is possible that the process selected by
It's not enough to hold a reference to the task struct, because it can
still be taken out of the circular list of threads. The RCU
assumptions don't hold in that case.
On Tue, Nov 5, 2013 at 5:18 PM, David Rientjes rient...@google.com wrote:
On Tue, 5 Nov 2013, Sameer Nanda wrote:
The
On Tue, 5 Nov 2013, Luigi Semenzato wrote:
It's not enough to hold a reference to the task struct, because it can
still be taken out of the circular list of threads. The RCU
assumptions don't hold in that case.
Could you please post a proper bug report that isolates this at the cause?
Ccing Oleg and Sergey
On Tue, Nov 05, 2013 at 05:27:29PM -0800, David Rientjes wrote:
On Tue, 5 Nov 2013, Luigi Semenzato wrote:
It's not enough to hold a reference to the task struct, because it can
still be taken out of the circular list of threads. The RCU
assumptions don't hold in
It's interesting that this was known for 3+ years, but nobody bothered
adding a small warning to the code.
We noticed this because it's actually happening on Chromebooks in the
field. We try to minimize OOM kills, but we can deal with them. Of
course, a hung kernel we cannot deal with.
On Tue,
Regarding other fixes: would it be possible to have the thread
iterator insert a dummy marker element in the thread list before the
scan? There would be one such dummy element per CPU, so that multiple
CPUs can scan the list in parallel. The loop would skip such
elements, and each dummy element
20 matches
Mail list logo